US9804973B1 - Using frequency domain to prioritize storage of metadata in a cache


Info

Publication number
US9804973B1
Authority
US
United States
Prior art keywords
addresses
frequency
cache
metadata
region
Prior art date
Legal status
Expired - Fee Related
Application number
US14/939,693
Inventor
Ori Shalev
Current Assignee
Pure Storage Inc
Original Assignee
Pure Storage Inc
Priority date
Filing date
Publication date
Application filed by Pure Storage, Inc.
Priority to US14/939,693
Assigned to PURE STORAGE, INC. Assignors: SHALEV, ORI
Priority to US15/682,699
Application granted
Publication of US9804973B1
Status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0871 Allocation or management of cache space
    • G06F12/0891 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
    • G06F12/12 Replacement control
    • G06F12/121 Replacement control using replacement algorithms
    • G06F12/122 Replacement control using replacement algorithms of the least frequently used [LFU] type, e.g. with individual count value
    • G06F12/123 Replacement control using replacement algorithms with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list
    • G06F12/126 Replacement control using replacement algorithms with special data handling, e.g. priority of data or instructions, handling errors or pinning
    • G06F12/127 Replacement control using replacement algorithms with special data handling, e.g. priority of data or instructions, handling errors or pinning using additional replacement algorithms
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1021 Hit rate improvement
    • G06F2212/31 Providing disk cache in a specific location of a storage system
    • G06F2212/313 In storage device
    • G06F2212/46 Caching storage objects of specific type in disk cache
    • G06F2212/466 Metadata, control data
    • G06F2212/69
    • G06F2212/70 Details relating to dynamic memory management

Definitions

  • This invention relates to a storage system, and more particularly to caching metadata in a storage system.
  • Storage systems often store large amounts of data and process a variety of different workloads from various numbers of clients. These storage systems typically have non-volatile storage devices which are used to store client data, and volatile memory to cache metadata used for locating the client data. As the amount of data increases, so does the amount of metadata, and determining which metadata to store in the cache(s) becomes more challenging.
  • Storage virtualization provides an abstraction of logical storage from physical storage so that end-users can access logical storage without identifying the underlying physical storage.
  • the logical storage may be accessed via a logical address space, with a volume and block number of a given request being used to generate an address within the logical address space.
  • a volume manager performs input/output (I/O) redirection by translating incoming I/O requests using logical addresses from end-users into new requests using addresses associated with physical locations in the storage devices.
  • Because some storage devices include additional address translation mechanisms, such as the address translation layers used in solid state storage devices, the translation from a logical address to another address may not be the only or final address translation.
  • Redirection utilizes metadata stored in one or more mapping tables.
  • information stored in the one or more mapping tables may be used for storage deduplication
  • a data storage subsystem may be coupled to a network, and the data storage subsystem may receive read and write requests via the network from one or more client computers.
  • the data storage subsystem may include a plurality of data storage locations on a device group including a plurality of storage devices.
  • the data storage subsystem may also include one or more mapping tables storing a plurality of entries for translating logical addresses of received requests to physical addresses corresponding to data storage locations. Rather than storing the entirety of the mapping table(s) in the device group, portions of the mapping table may be stored in a cache for faster access, allowing some lookups to be performed more efficiently with fewer accesses to the storage devices.
  • the mapping table(s) may be organized into pages, with each page storing a plurality of entries. Portions of the mapping table may be added to and evicted from the cache in page-sized allocation units. In other embodiments, other allocation unit sizes may be chosen.
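  • The following is a minimal, illustrative sketch (not taken from the patent) of caching metadata in page-sized allocation units; the names MetadataCache and PAGE_ENTRIES, and the choice of 512 entries per page, are assumptions for illustration only.

```python
PAGE_ENTRIES = 512  # assumed number of translations per cached metadata page

class MetadataCache:
    """Holds mapping-table metadata in page-sized allocation units."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = {}  # page_id -> {logical_address: physical_address}

    def page_id(self, logical_address):
        # Each page covers a contiguous run of logical addresses (a simplification).
        return logical_address // PAGE_ENTRIES

    def lookup(self, logical_address):
        page = self.pages.get(self.page_id(logical_address))
        return None if page is None else page.get(logical_address)

    def install(self, page_id, entries):
        if page_id not in self.pages and len(self.pages) >= self.capacity:
            # Placeholder eviction; the frequency-based prioritization described
            # later would decide which page to drop instead.
            self.pages.pop(next(iter(self.pages)))
        self.pages[page_id] = dict(entries)
```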
  • a typical storage system may process a variety of different types of data workloads. Some of the workloads may have random access patterns while other workloads may have more predictable access patterns. As metadata from these various workloads competes for cache space, it is challenging for the storage system to devise efficient schemes for choosing which metadata to retain in the cache.
  • the storage system may include a storage controller, a cache, and a plurality of storage devices.
  • the storage controller may be configured to analyze the workloads that are being processed. In one embodiment, the storage controller may determine which workloads have random access patterns and which workloads have predictable access patterns. Also, the storage controller may identify metadata which corresponds to the workloads with random access patterns and identify which metadata corresponds to the workloads with predictable access patterns. The metadata associated with the random workloads may be evicted from the cache while the metadata associated with the predictable workloads may be retained in the cache.
  • a plurality of addresses corresponding to a plurality of input/output (I/O) accesses to the storage system may be captured, with the plurality of addresses targeting the logical address space of the storage system.
  • the logical address space may be partitioned into a plurality of regions, and the plurality of addresses may be sorted into a plurality of lists, with one list for each region of the logical address space.
  • For each list of captured addresses, the list may be transformed into a frequency domain representation to allow for spectral analysis of the frequency components of the access pattern to the corresponding region.
  • a Fourier-related transform may be utilized to generate the frequency domain representation of each list.
  • a score may be generated for each region based on the analysis of the corresponding frequency domain representation.
  • a cache replacement algorithm may utilize the generated scores to determine which pages in the cache to replace when new metadata needs to be loaded into the cache. The cache replacement algorithm may attempt to prevent metadata for workloads with random access patterns from evicting metadata for workloads that have predictable access patterns.
  • If a given frequency domain representation indicates the access pattern is a highly random access pattern, a low score may be given to the corresponding region. Any metadata pages containing address translations for this region may be assigned this low score when these metadata pages are stored in the cache.
  • If a given frequency domain representation indicates the access pattern is a low random access pattern, a high score may be given to the corresponding region. This high score may be assigned to any metadata pages which have address translations for this region and which are stored in the cache.
  • the cache may retain metadata pages with high scores while evicting metadata pages with low scores.
  • Low random access patterns tend to correspond to accesses that will retarget the same region of the logical address space for future accesses. Accordingly, metadata pages corresponding to regions with low random access patterns are likely to be reused and the cache may attempt to retain metadata pages with high scores in the cache. Highly random access patterns tend to correspond to accesses that will not come back to the same region of the logical address space for future accesses. Therefore, metadata pages corresponding to regions with high random access patterns are not likely to be reused and the cache may attempt to evict metadata pages with low scores from the cache. In this way, the efficiency of the storage system will be improved by retaining metadata pages in the cache which are likely to be used again, resulting in fewer lookups to the storage devices for metadata.
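  • As a rough end-to-end sketch of the flow described above, the following code captures addresses, buckets them by region, transforms each bucket to the frequency domain, and derives a per-region score. Every name here (REGION_SIZE, score_regions, the 1 GB region size, and the frequency-weighted score) is an assumption, not the patent's implementation.

```python
import numpy as np

REGION_SIZE = 1 << 30  # assumed 1 GB regions (one example used later in the text)

def region_of(address):
    return address // REGION_SIZE

def score_regions(captured_addresses):
    """Return an assumed per-region score: higher means less random (retain)."""
    buckets = {}
    for address in captured_addresses:          # only arrival order is preserved
        buckets.setdefault(region_of(address), []).append(address)
    scores = {}
    for region, addresses in buckets.items():
        spectrum = np.abs(np.fft.rfft(np.asarray(addresses, dtype=float)))
        weights = np.arange(len(spectrum))      # emphasize high-frequency energy
        randomness = float((spectrum * weights).sum() / (spectrum.sum() + 1e-12))
        scores[region] = -randomness            # low randomness -> high score
    return scores

# Example usage on a short, mostly sequential trace.
scores = score_regions([100, 101, 102, 103, 104, 105, 106, 107])
```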
  • FIG. 1 is a generalized block diagram illustrating one embodiment of a storage system.
  • FIG. 2 is a block diagram illustrating one embodiment of a mapping table.
  • FIG. 3 illustrates one embodiment of a storage controller.
  • FIG. 4 illustrates one embodiment of a listing of captured I/O accesses.
  • FIG. 5 illustrates one embodiment of a frequency domain representation of an I/O access listing.
  • FIG. 6 illustrates one embodiment of converting addresses of I/O accesses into a frequency domain representation.
  • FIG. 7 is a generalized flow diagram illustrating one embodiment of a method for assigning priorities to metadata stored in a cache.
  • FIG. 8 is a generalized flow diagram illustrating one embodiment of a method for measuring the randomness of access patterns to regions of a logical address space.
  • FIG. 9 is a generalized flow diagram illustrating one embodiment of a method for prioritizing metadata stored in a cache.
  • Configured To. Various units, circuits, or other components may be described or claimed as “configured to” perform a task or tasks.
  • “configured to” is used to connote structure by indicating that the units/circuits/components include structure (e.g., circuitry) that performs the task or tasks during operation. As such, the unit/circuit/component can be said to be configured to perform the task even when the specified unit/circuit/component is not currently operational (e.g., is not on).
  • the units/circuits/components used with the “configured to” language include hardware—for example, circuits, memory storing program instructions executable to implement the operation, etc.
  • Reciting that a unit/circuit/component is “configured to” perform one or more tasks is expressly intended not to invoke 35 U.S.C. § 112, sixth paragraph, for that unit/circuit/component.
  • “configured to” can include generic structure (e.g., generic circuitry) that is manipulated by software and/or firmware (e.g., an FPGA or a general-purpose processor executing software) to operate in a manner that is capable of performing the task(s) at issue.
  • “Configured to” may also include adapting a manufacturing process (e.g., a semiconductor fabrication facility) to fabricate devices (e.g., integrated circuits) that are adapted to implement or perform one or more tasks.
  • “First,” “second,” etc. are used as labels for nouns that they precede, and do not imply any type of ordering (e.g., spatial, temporal, logical, etc.).
  • For example, “first” and “second” regions of a logical address space can be used to refer to any two regions.
  • “Based on.” This term is used to describe one or more factors that affect a determination. It does not foreclose additional factors that may affect a determination. That is, a determination may be solely based on those factors or based, at least in part, on those factors.
  • Storage system 100 may include storage controller 110 and storage device groups 130 and 140 , which are representative of any number of storage device groups (or data storage arrays).
  • storage device group 130 includes storage devices 135 A-N, which are representative of any number and type of storage devices (e.g., solid-state drives (SSDs)).
  • Storage controller 110 may be coupled directly to client computer system 125 , and storage controller 110 may be coupled remotely over network 120 to client computer system 115 .
  • Clients 115 and 125 are representative of any number of clients which may utilize storage controller 110 for storing and accessing data in system 100 .
  • Storage controller 110 may include software and/or hardware configured to provide access to storage devices 135 A-N. Although storage controller 110 is shown as being separate from storage device groups 130 and 140 , in some embodiments, storage controller 110 may be located within one or each of storage device groups 130 and 140 . Storage controller 110 may include or be coupled to a base operating system (OS), a volume manager, and additional control logic for implementing the various techniques disclosed herein.
  • Storage controller 110 may include and/or execute on any number of processors and may include and/or execute on a single host computing device or be spread across multiple host computing devices, depending on the embodiment. In some embodiments, storage controller 110 may generally include or execute on one or more file servers and/or block servers. Storage controller 110 may run any type of OS (e.g., Windows®, Unix®, Linux®, Solaris®, MacOS®) depending on the embodiment.
  • the number and type of clients, storage controllers, networks, storage device groups, and data storage devices is not limited to those shown in FIG. 1 .
  • the methods and mechanisms disclosed herein can be implemented in various networks and systems including computer systems, security systems, wireless networks, network architectures, data centers, operating systems, communication devices, and various other devices and systems.
  • Network 120 may utilize a variety of techniques including wireless connection, direct local area network (LAN) connections, wide area network (WAN) connections such as the Internet, a router, storage area network, Ethernet, and others.
  • Network 120 may comprise one or more LANs that may also be wireless.
  • Network 120 may further include remote direct memory access (RDMA) hardware and/or software, transmission control protocol/internet protocol (TCP/IP) hardware and/or software, routers, repeaters, switches, grids, and/or others. Protocols such as Fibre Channel, Fibre Channel over Ethernet (FCoE), iSCSI, and so forth may be used in network 120 .
  • the network 120 may interface with a set of communications protocols used for the Internet such as the Transmission Control Protocol (TCP) and the Internet Protocol (IP), or TCP/IP.
  • Client computer systems 115 and 125 are representative of any number and type of stationary or mobile computers such as desktop personal computers (PCs), servers, server farms, workstations, laptops, handheld computers, personal digital assistants (PDAs), smart phones, and so forth.
  • client computer systems 115 and 125 include one or more processors comprising one or more processor cores.
  • Each processor core includes circuitry for executing instructions according to a predefined general-purpose instruction set. For example, the x86 instruction set architecture may be selected. Alternatively, the ARM®, Alpha®, PowerPC®, SPARC®, or any other general-purpose instruction set architecture may be selected.
  • the processor cores may access cache memory subsystems for data and computer program instructions.
  • the cache subsystems may be coupled to a memory hierarchy comprising random access memory (RAM) and a storage device.
  • mapping tables may be used for I/O redirection or translation, deduplication of duplicate copies of user data, snapshot mappings, and so forth.
  • Mapping tables may be stored in the storage devices 135 A-N (of FIG. 1 ).
  • the diagram shown in FIG. 2 represents a logical representation of one embodiment of the organization and storage of the mapping table.
  • Each level shown may include mapping table entries corresponding to a different period of time. For example, level “1” may include information older than information stored in level “2”. Similarly, level “2” may include information older than information stored in level “3”. The information stored in the records, pages and levels shown in FIG. 2 may be stored in a random-access manner within the storage devices 135 A-N. Additionally, copies of portions or all of a given mapping table may be stored in a random-access memory (RAM), in buffers within a storage controller, and/or in one or more caches for faster access.
  • a corresponding index may be included in each level for mappings which are part of the level. Such an index may include an identification of mapping table entries and where they are stored (e.g., an identification of the page) within the level.
  • the index associated with mapping table entries may be a distinct entity, or entities, which are not logically part of the levels themselves.
  • each mapping table comprises a set of rows and columns.
  • a single record may be stored in a mapping table as a row.
  • a record may also be referred to as an entry.
  • a record stores at least one tuple including a key. Tuples may (or may not) also include data fields including data such as a pointer used to identify or locate data components stored in the storage subsystem.
  • the storage subsystem may include storage devices (e.g., SSDs) which have internal mapping mechanisms.
  • the pointer in the tuple may not be an actual physical address per se. Rather, the pointer may be a logical address which the storage device maps to a physical location within the device.
  • records in the mapping table may only contain key fields with no additional associated data fields. Attributes associated with a data component corresponding to a given record may be stored in columns, or fields, in the table. Status information, such as a valid indicator, a data age, a data size, and so forth, may be stored in fields, such as Field0 to FieldN shown in FIG. 2 .
  • a key is an entity in a mapping table that may distinguish one row of data from another row. Each row may also be referred to as an entry or a record. A key may be a single column, or it may consist of a group of columns used to identify a record.
  • an address translation mapping table may utilize a key comprising a volume identifier (ID), a logical or virtual address, a snapshot ID, a sector number, and so forth.
  • a given received read/write storage access request may identify a particular volume, sector and length.
  • a sector may be a logical block of data stored in a volume. Sectors may have different sizes on different volumes.
  • the address translation mapping table may map a volume in sector-size units.
  • a volume identifier (ID) along with a received sector number may be used to access the address translation mapping table. Therefore, in such an embodiment, the key value for accessing the address translation mapping table is the combination of the volume ID and the received sector number. In other embodiments, other values may be used to generate a key value. In one embodiment, the records within the address translation mapping table are sorted by key value.
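  • A hypothetical sketch of an address translation lookup keyed by (volume ID, sector number), as described above; the class name, record layout, and sorted-list representation are assumptions, not the patent's data structure.

```python
from bisect import bisect_left

class AddressTranslationTable:
    """Toy sorted mapping table keyed by (volume_id, sector)."""

    def __init__(self):
        self.keys = []     # sorted (volume_id, sector) keys
        self.records = []  # parallel (segment_id, physical_address, fields) records

    def insert(self, volume_id, sector, segment_id, physical_address, fields=None):
        key = (volume_id, sector)
        i = bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.records.insert(i, (segment_id, physical_address, fields or {}))

    def lookup(self, volume_id, sector):
        key = (volume_id, sector)
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.records[i]
        return None  # miss: fall back to the on-device mapping-table levels

table = AddressTranslationTable()
table.insert(volume_id=7, sector=42, segment_id=3, physical_address=0x1F400)
assert table.lookup(7, 42)[0] == 3
```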
  • the address translation mapping table may convey a physical pointer value that indicates a location within the data storage subsystem 170 storing a data component corresponding to the received data storage access request.
  • the key value may be compared to one or more key values stored in the mapping table. In the illustrated example, simpler key values, such as “0”, “2”, “12” and so forth, are shown for ease of illustration.
  • the physical pointer value may be stored in one or more of the fields in a corresponding record.
  • the physical pointer value may include a segment identifier (ID) and a physical address identifying the location of storage.
  • a segment may be a basic unit of allocation in each of the storage devices 135 A-N.
  • a segment may have a redundant array of independent devices (RAID) level and a data type.
  • a segment may have one or more of the storage devices 135 A-N selected for corresponding storage.
  • a segment may be allocated an equal amount of storage space on each of the one or more selected storage devices of the storage devices 135 A-N.
  • the mapping table shown in FIG. 2 may be a deduplication table.
  • a deduplication table may utilize a key comprising a hash value determined from a data component associated with a storage access request.
  • the initial steps of a deduplication operation may be performed concurrently with other operations, such as a read/write request, a garbage collection operation, a trim operation, and so forth.
  • the data sent from one of the client computer systems may be a data stream, such as a byte stream.
  • a chunking algorithm may perform the dividing of the data stream into discrete data components which may be referred to as “chunks”.
  • a chunk may be a sub-file content-addressable unit of data.
  • the resulting chunks may then be stored in one of the data storage arrays 120 a - 120 b to allow for sharing of the chunks.
  • Such chunks may be stored separately or grouped together in various ways.
  • the chunks may be represented by a data structure that allows reconstruction of a larger data component from its chunks (e.g. a particular file may be reconstructed based on one or more smaller chunks of stored data).
  • a corresponding data structure may record its corresponding chunks including an associated calculated hash value, a pointer (physical and/or logical) to its location in a storage device 135 A-N, and its length.
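  • The following is a rough sketch, under assumed names, of how a deduplication table keyed by a content hash might record chunks; fixed-size chunking is used here only for brevity, whereas a real chunking algorithm may use content-defined boundaries.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size for illustration

def chunk_and_index(byte_stream, dedup_table, chunk_store):
    """dedup_table: hash -> (location, length); chunk_store: location -> bytes."""
    chunk_refs = []
    for offset in range(0, len(byte_stream), CHUNK_SIZE):
        chunk = byte_stream[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in dedup_table:
            location = len(chunk_store)            # stand-in for a device address
            chunk_store[location] = chunk
            dedup_table[digest] = (location, len(chunk))
        chunk_refs.append(digest)                  # enough to rebuild the stream
    return chunk_refs
```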
  • a mapping table may comprise one or more levels as shown in FIG. 2 .
  • a mapping table may comprise 16 to 64 levels, although mapping tables with other numbers of levels are possible and contemplated.
  • FIG. 2 three levels labeled Level “1”, Level “2” and Level “N” are shown for ease of illustration.
  • Each level within a mapping table may include one or more partitions.
  • multiple levels within a mapping table are sorted by time. For example, in FIG. 2 , Level “1” may be older than Level “2”. Similarly, Level “2” may be older than Level “N”.
  • each partition is a 4 kilo-byte (KB) page.
  • Level “N” is shown to comprise pages 210a-210g, Level “2” comprises pages 210h-210j, and Level “1” comprises pages 210k-210n. It is possible and contemplated that other partition sizes may also be chosen for each of the levels within a mapping table. In addition, it is possible for one or more levels to have a single partition, which is the level itself.
  • Storage controller 300 may include cache 305 , metadata frequency analyzer 310 , and processor(s) 315 .
  • Metadata frequency analyzer 310 may be implemented using any combination of hardware and/or software. It is noted that while metadata frequency analyzer 310 is shown separately from processor(s) 315 , portions or the entirety of metadata frequency analyzer 310 may be executed by processor(s) 315 . It is noted that storage controller 300 may also include other logic and components (e.g., network interface, RAM) which are not shown in FIG. 3 for ease of illustration. Storage controller 300 may also be coupled to one or more clients (not shown) and one or more storage devices (not shown).
  • Storage controller 300 may be configured to receive I/O requests targeting one or more storage devices of a storage system. Storage controller 300 may also be configured to process the received I/O requests by storing data at the targeted locations or retrieving data from the targeted locations. In order to locate the targeted locations, storage controller 300 may retrieve metadata corresponding to the logical addresses of the received I/O requests.
  • the metadata may include mapping table entries and/or index entries, with the mapping table entries including translations from the logical address space to the physical address space corresponding to the storage devices of the storage system.
  • Storage controller 300 may be configured to reduce the latency of I/O accesses targeting the one or more storage devices of a storage system.
  • One approach for reducing latency is to cache metadata so as to decrease the number of times the external storage devices are accessed.
  • Cache 305 may be configured to store metadata for the various applications being processed by the host storage system. In some embodiments, cache 305 may store both metadata and data. In other embodiments, cache 305 may store only metadata. Cache 305 may have any configuration (e.g., direct mapped or set associative).
  • Metadata is shown as being stored in cache 305 in page sized units (e.g., metadata page 325 A-B), with each page including a plurality of translation entries, it is noted that this is merely for illustrative purposes. In other embodiments, other unit sizes of metadata may be stored in cache 305 . For example, in another embodiment, individual translation entries may be allocated in cache 305 . The allocation size of metadata stored in cache 305 may also be referred to more generally as a “metadata grain”.
  • Metadata frequency analyzer 310 may be configured to perform a frequency analysis on the access patterns to the one or more storage devices of the storage system. Metadata frequency analyzer 310 may include any combination of hardware and/or software.
  • a plurality of received I/O accesses may be captured by storage controller 300 and provided as inputs to metadata frequency analyzer 310 . More specifically, the logical addresses of the received I/O accesses may be captured and logged into one or more lists.
  • each logical address may consist of a volume ID and a logical block address (LBA). The one or more lists may then be transformed from the logical address space domain to the frequency domain.
  • the logical address space may be treated as though it were the time domain when using a Fourier-related transform to transform the addresses into the frequency domain.
  • each access may be considered to have been received a fixed amount of time subsequent to the previous access.
  • the actual time the access was made will not be captured, but only the order in which the accesses were made will be retained.
  • metadata frequency analyzer 310 may receive as an input the address offsets of the I/O accesses in the logical address space. Then, metadata frequency analyzer 310 may convert these logical address offsets to the frequency domain. In one embodiment, metadata frequency analyzer 310 may use a Fourier transform, such as the discrete Fourier transform, to generate a frequency domain representation of the logical address offsets. In another embodiment, metadata frequency analyzer 310 may use a discrete cosine transform (DCT) to convert the addresses to the frequency domain. Using the DCT, metadata frequency analyzer 310 may convert the sequence of address values into a sum of cosine terms oscillating at different frequencies. In other embodiments, other types of transforms (e.g., wavelet) may be used to convert the address offsets to the frequency domain.
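  • One way the address-to-frequency-domain conversion could look, treating the i-th captured address offset as the i-th sample and applying a plain DCT-II built with NumPy; the function names and the choice of NumPy are assumptions, since the patent does not prescribe an implementation.

```python
import numpy as np

def dct_ii(samples):
    """Plain DCT-II: one output coefficient per frequency, low to high."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    if n == 0:
        return x
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    return basis @ x

def address_spectrum(captured_addresses, base_offset=0):
    """Treat each captured address offset as the next 'sample' in a signal."""
    offsets = [a - base_offset for a in captured_addresses]
    return dct_ii(offsets)
```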
  • Metadata frequency analyzer 310 may perform a spectral analysis of the generated frequency domain components. In one embodiment, if most of the energy in the frequency domain signal is located in the low frequency components, then metadata frequency analyzer 310 may identify these accesses as a predictable, low-random access pattern. If most of the energy in the frequency domain signal is located in the high frequency components, then metadata frequency analyzer 310 may identify these accesses as a highly-random access pattern.
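  • A hedged sketch of such a spectral check: classify an access pattern by where the spectral energy sits, treating the lower half of the coefficients as "low frequency". The split point and threshold are assumptions.

```python
import numpy as np

def classify_pattern(spectrum, low_fraction=0.5, predictable_threshold=0.7):
    """Label a pattern 'predictable' if most spectral energy is low frequency."""
    mag = np.abs(np.asarray(spectrum, dtype=float))
    if mag.size == 0:
        return "predictable"
    mag[0] = 0.0                                  # ignore the zero-frequency term
    split = max(1, int(len(mag) * low_fraction))
    low_energy = mag[:split].sum() / (mag.sum() + 1e-12)
    return "predictable" if low_energy >= predictable_threshold else "random"
```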
  • the logical address space may be partitioned into a plurality of regions.
  • Metadata frequency analyzer 310 may utilize a scoring function formula to generate a score for the various regions of the logical address space. For example, if a first region which is servicing requests corresponds to a low-random access pattern as determined by the spectral analysis, then the first region may be given a high score. Any metadata pages which are stored in cache 305 and which correspond to the first region may be assigned the high score. If a second region corresponds to a highly random access pattern, then the second region may be given a low score. Any metadata pages which are stored in cache 305 and which correspond to the second region may be assigned the low score. As shown in FIG. 3 , metadata pages 325 A-B have been assigned scores 320 A-B, which may correspond to the scores assigned to their corresponding regions in the logical address space.
  • cache 305 may prioritize retaining metadata pages with a high score while attempting to evict metadata pages with a low score. It is noted that the assignment of scores may be reversed in other embodiments, such that highly random access pattern regions may be given a high score and low random access pattern regions may be given a low score. In these embodiments, cache 305 may prioritize retaining metadata pages with a low score while attempting to evict metadata pages with a high score. Any of various scoring functions may be utilized to generate a score for the various regions of the logical address space based on the corresponding frequency domain representations. For example, in one embodiment, an integral of the frequency domain representation may be calculated to generate a score for a given region.
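  • One possible scoring function along the lines of the integral example above, sketched with a discrete sum standing in for the integral; the normalization and the "higher score means more predictable" convention are assumptions.

```python
import numpy as np

def region_score(spectrum):
    """Score in [0, 1]: near 1 means energy sits in low frequencies (retain)."""
    mag = np.abs(np.asarray(spectrum, dtype=float))
    if len(mag) < 2:
        return 1.0
    freq = np.arange(len(mag), dtype=float)
    weighted = float((mag * freq).sum())       # discrete, frequency-weighted integral
    total = float(mag.sum()) * (len(mag) - 1) + 1e-12
    return 1.0 - weighted / total
```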
  • Listing 400 may include the most recently detected I/O accesses to the storage devices in a storage system (e.g., storage system 100 of FIG. 1 ). The time period over which listing 400 was captured may vary depending on the embodiment. Also, listing 400 includes a number of accesses ‘N’, wherein ‘N’ is representative of any number of accesses, depending on the embodiment.
  • a storage controller may capture I/O accesses over a certain period of time. In another embodiment, the storage controller may start capturing I/O accesses and continue capturing I/O accesses until a certain threshold number of I/O accesses has been reached. The threshold number of accesses may vary depending on the embodiment.
  • each logical address of the access may be logged and stored in listing 400 . These addresses are shown starting with A1, which is followed by A2, A3, and so on until AN, which represents the logical address of the last captured access.
  • Listing 400 may be treated as though the access number were the x (or horizontal) axis and the logical address were the y (or vertical) axis.
  • Listing 400 may then be converted into a frequency domain representation using any of various transforms (e.g., Discrete Fourier Transform (DFT), DCT, wavelet transform). In one embodiment, the conversion to the frequency domain representation may be performed by assuming the access number is a time measurement and by assuming the logical address is an amplitude.
  • the columns of listing 400 may be treated as though they were time (or sample number) and amplitude rather than access number and logical address, respectively. Therefore, the conversion to the frequency domain representation is straightforward and may be performed using any of various techniques well known to those skilled in the art.
  • listing 400 may be split up into multiple listings, and accesses may be categorized according to the region of the logical address space in which they are located. For example, if a logical address space is 4 gigabytes (GB) in size, then each 1 GB region of the logical address space may have its own listing. Any accesses that fall within the first GB of the logical address space may be stored in a first listing, accesses that fall within addresses 1 GB-2 GB may be stored in a second listing, and so on. In this way, a different frequency domain representation of each region may be generated and a score may be assigned to a metadata page based on the score of the region in which the metadata page is located.
  • the regions may all be the same size, as in the example described above with 1 GB size regions. However, in other embodiments, the regions may be different sizes, with some regions larger than other regions. For example, in one embodiment, an address space may be split up into 10 regions, with 6 of the regions equal in size at 1 GB, while 2 of the regions are of size 500 megabytes (MB), and the remaining 2 regions are of size 250 MB.
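  • An illustrative sketch of region lookup when regions are not all the same size; the boundary layout below mirrors the 10-region example above and is an assumption.

```python
from bisect import bisect_right

GB, MB = 1 << 30, 1 << 20
# 6 x 1 GB, 2 x 500 MB, and 2 x 250 MB regions laid out back to back (assumed).
region_sizes = [GB] * 6 + [500 * MB] * 2 + [250 * MB] * 2

boundaries = []
end = 0
for size in region_sizes:
    end += size
    boundaries.append(end)        # exclusive upper bound of each region

def region_index(address):
    """Index (0-9) of the region containing this logical address."""
    return bisect_right(boundaries, address)

assert region_index(0) == 0
assert region_index(6 * GB + 600 * MB) == 7   # inside the second 500 MB region
```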
  • the logical address space may be partitioned into regions prior to capturing addresses of I/O accesses.
  • the logical address space may be partitioned into regions after capturing addresses of I/O accesses.
  • the addresses may be analyzed prior to partitioning the logical address space into regions to determine how best to perform the partitioning. For example, if a large number of accesses are made to a particular area of the logical address space, then this area may be partitioned into smaller regions as compared to areas of the logical address space with small numbers of accesses. It is noted that other ways of partitioning the logical address space into regions are possible and are contemplated.
  • the captured addresses of I/O accesses may be assigned to their appropriate listings. Then, for each listing, the addresses in the logical address space may be converted into a frequency domain representation using any suitable transform (e.g., DFT, fast Fourier transform (FFT), DCT). A spectral analysis may be performed on each frequency domain representation so as to generate a score for the corresponding region in the logical address space. In one embodiment, if the frequency domain representation has mostly low frequency components, then a high score may be generated for the region. If the frequency domain representation has mostly high frequency components, then a low score may be generated for the region. Then, the metadata pages stored in the cache may be scored according to the score of the region to which they correspond. The cache may then utilize this score when determining which metadata pages to evict from the cache. The cache may attempt to evict metadata pages with a low score, corresponding to a region with mostly high frequency components.
  • frequency domain representation 500 is merely one example of a frequency domain representation after the addresses of a listing (e.g., listing 400 of FIG. 4 ) have been converted into the frequency domain.
  • Other frequency domain representations may have a different distribution of frequency components depending on the types of access patterns used to generate the addresses of the corresponding listings.
  • frequency domain representation 500 includes mostly high frequency components, which corresponds to a highly random access pattern for the addresses of the accesses in the corresponding listing.
  • a series of frequency bins may be used to represent frequency domain representation 500 .
  • the frequency bins may divide the total signal spectrum into equally spaced frequency ranges, and the size of each bin may vary according to the embodiment.
  • each frequency bin (F1, F2, etc.) shown on the horizontal axis may correspond to 1 kilohertz (kHz) of frequency range.
  • the vertical axis may measure the amplitude of the energy in each frequency bin, and the amplitude may be measured using any suitable unit.
  • energy as used in this context is meant to indicate that standard techniques for analyzing and measuring a frequency domain representation may be utilized. However, the term “energy” is not intended to suggest that the original addresses contain energy in the same manner of an electrical signal undergoing a frequency domain transformation.
  • the term “energy” may be defined as the numerical value of the frequency components in the frequency domain transformation.
  • the frequency component values in the frequency range from 0 to 1 kHz may be calculated and displayed above the frequency bin F1 in FIG. 5 .
  • the frequency component values in the frequency range from 1 kHz to 2 kHz are shown above frequency bin F2, the values in the frequency range from 2 kHz to 3 kHz are shown above frequency bin F3, and so on. It is noted that this is merely one example of a way to partition the total frequency range into bins for a particular spectral analysis. In other embodiments, other numbers of frequency bins may be utilized and the frequency bins may correspond to other sizes of frequency ranges.
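  • A small sketch of the binning step: divide the frequency-domain representation into equally spaced ranges and sum the magnitude ("energy") in each bin. The bin count of five is an assumption matching the F1-F5 illustration.

```python
import numpy as np

def bin_spectrum(spectrum, num_bins=5):
    """Sum of spectral magnitude in each of num_bins equally spaced ranges."""
    mag = np.abs(np.asarray(spectrum, dtype=float))
    edges = np.linspace(0, len(mag), num_bins + 1).astype(int)
    return [float(mag[edges[i]:edges[i + 1]].sum()) for i in range(num_bins)]
```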
  • the frequency domain representation may be analyzed using the discrete frequency components generated by the transformation from the corresponding address listing.
  • frequency domain representation 500 may be analyzed using other suitable techniques. For example, components at a predetermined percentage or decibel level above the average signal level may be identified, a peak signal amplitude level may be located, and/or any other suitable spectral analysis may be used to identify the type of access pattern which generated spectrum 500 .
  • a measure of randomness may be generated for frequency domain representation 500 based on a spectral analysis of the various frequency bins F1-F5. In one embodiment, the measure of randomness may then be compared to one or more thresholds to determine if the corresponding access pattern is a low random access pattern or high random access pattern. Metadata corresponding to a low random access pattern may be prioritized for retention in a cache while metadata corresponding to a high random access pattern may be evicted from the cache.
  • an integral of the frequency components of representation 500 may be computed in order to measure an amount of randomness in the corresponding access pattern, with the integral giving more weight to higher frequency components.
  • a frequency domain representation with mostly high frequency components will have a relatively high value when the integral is computed.
  • a frequency domain representation with mostly low frequency components will have a relatively low value when the integral is computed.
  • a frequency representation with mostly high frequency components may generate a high measure of randomness while a frequency representation with mostly low frequency components may generate a low measure of randomness.
  • Frequency domain representations with values spread out evenly between high and low frequency components will generate a measure of randomness in the middle of the measurement range.
  • the measure of randomness may then be converted into a score which may then be assigned to the region of the logical address space corresponding to frequency domain representation 500 .
  • a high measure of randomness may be converted to a low score while a low measure of randomness may be converted to a high score.
  • any metadata pages stored in the cache which correspond to a given region may be assigned the score which was generated for the given region.
  • the measures of randomness may be converted to scores using other techniques.
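  • A minimal sketch of one such conversion, mapping a randomness measure to a retention score with two thresholds; the threshold values and the three-level score are assumptions.

```python
def randomness_to_score(randomness, low_threshold=0.3, high_threshold=0.7):
    """Invert the scale: low randomness earns a high retention score."""
    if randomness <= low_threshold:
        return 2    # predictable region: strongly prefer retention
    if randomness >= high_threshold:
        return 0    # highly random region: prefer eviction
    return 1        # in between
```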
  • the 1×8 matrix 605 includes eight addresses (A1-A8) from captured I/O accesses.
  • the eight addresses in matrix 605 may be converted into frequency domain representation 615 using 8 ⁇ 8 DCT matrix 610 .
  • Standard matrix multiplication may be utilized with the eight addresses (A1-A8) multiplied by the first column of matrix 610 to generate the value F1 of matrix 615 , the eight addresses (A1-A8) multiplied by the second column of matrix 610 to generate the value F2, and so on.
  • Matrix 610 includes a zero frequency waveform in the leftmost column and the frequency increases in each column to the right with the highest frequency waveform shown in the rightmost column. Accordingly, frequency domain representation 615 includes eight frequency components (F1-F8), with F1 representing the lowest frequency and F8 representing the highest frequency. Frequency domain representation 615 may be analyzed to determine which frequency components have the highest values and to generate a corresponding randomness measure.
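  • A sketch of the matrix form described above: a row of eight captured addresses multiplied by an 8×8 DCT basis matrix, whose leftmost column is the zero-frequency waveform, yields eight frequency components F1-F8. The example address values are arbitrary, and the unscaled cosine basis is one convention, not the patent's matrix.

```python
import numpy as np

def dct_basis(num_samples=8, num_freqs=8):
    """Columns are cosine waveforms; column 0 is the zero-frequency waveform."""
    i = np.arange(num_samples).reshape(-1, 1)    # row = sample (address) index
    k = np.arange(num_freqs).reshape(1, -1)      # column = frequency index
    return np.cos(np.pi * (2 * i + 1) * k / (2 * num_samples))  # values in [-1, 1]

addresses = np.array([40, 41, 42, 43, 1000, 1001, 1002, 1003], dtype=float)
components = addresses @ dct_basis()             # F1..F8, lowest to highest frequency
```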
  • Matrix 610 may be adjusted in size to accommodate a larger number of addresses that have been captured in other embodiments. For example, if one thousand addresses have been captured, then matrix 610 may have one thousand rows and eight columns. Additionally, matrix 610 may have more than eight columns in other embodiments, to increase the granularity of frequency components which can be detected in the addresses of matrix 605 .
  • matrix 610 may have 16 columns, 32 columns, 64 columns, or other numbers of columns. It is also noted that the values shown in matrix 610 are merely indicative of one embodiment. Other embodiments may utilize other values within matrix 610 without departing from the spirit of the methods and mechanisms disclosed herein. For example, other DCT matrices may be utilized with other values. Additionally, in other embodiments, other types of transforms besides the DCT may be utilized to generate a frequency domain representation from address matrix 605 .
  • While the values within matrix 610 are within the range from -1 to 1, in other embodiments the values may be scaled by a factor into other ranges.
  • a custom matrix may be utilized with custom waveforms in each column corresponding to the waveforms expected to be encountered in the access patterns being serviced by the storage system.
  • Lower frequencies may be utilized in the leftmost columns of matrix 610 with the frequency increasing as the columns move to the right, but the frequencies may differ from the traditional DCT matrix scheme.
  • the leftmost column of the multiplication matrix may have a positive frequency rather than having a frequency of zero as is shown in matrix 610 .
  • only low frequencies may be represented in the multiplication matrix, and the values in the resultant matrix may indicate the presence or absence of low frequencies, while omitting any check for high frequencies.
  • only high frequencies may be represented in the multiplication matrix, and the values in the resultant matrix may indicate the presence or absence of high frequencies, while omitting any check for low frequency components.
  • FIG. 7 one embodiment of a method 700 for assigning priorities to metadata stored in a cache is shown. Any of the storage controllers, caches, and/or other control logic described throughout this specification may generally operate in accordance with method 700 .
  • the steps in this embodiment are shown in sequential order. However, some steps may occur in a different order than shown, some steps may be performed concurrently, some steps may be combined with other steps, and some steps may be absent in another embodiment.
  • An amount of randomness may be measured in a plurality of accesses to a given address space (block 705 ).
  • a frequency domain representation of the addresses of the plurality of accesses may be generated. Then the components of the frequency domain representation may be analyzed to determine if the representation includes mostly high frequency components or mostly low frequency components. If the representation includes mostly high frequency components, then the amount of randomness may have a high value. If the representation includes mostly low frequency components, then the amount of randomness may be measured as having a low value. In other embodiments, other techniques for measuring the amount of randomness in a plurality of accesses to the given address space may be utilized. It is also noted that in one embodiment, the given address space may be an individual region of the total logical address space of a storage system.
  • a relatively high priority may be assigned to metadata associated with the given address space if the measured amount of randomness is relatively low (block 710 ). In one embodiment, the measured amount of randomness may be considered relatively low if the measured amount is less than a first threshold. A relatively low priority may be assigned to the metadata if the measured amount of randomness is relatively high (block 715 ). In one embodiment, the measured amount of randomness may be considered relatively high if the measured amount is greater than a second threshold. In one embodiment, the metadata may be assigned a score based on the assigned priority, and the score may be stored in the cache alongside the metadata.
  • Metadata with a relatively high priority may be preferentially retained in the cache over metadata with a relatively low priority (block 720 ).
  • the cache may utilize a cache replacement algorithm which bases eviction decisions on a variety of factors. For example, in one embodiment, the cache may utilize a least recently used (LRU) algorithm to select a first metadata page to be considered for eviction. After selecting the first metadata page, the cache may check the priority assigned to the metadata page based on the measured amount of randomness. If the selected metadata page has a relatively high priority, then the cache may retain the first metadata page and utilize the LRU algorithm to select a second metadata page to be considered for eviction. The cache may continue selecting metadata pages using the LRU algorithm until a metadata page with a relatively low priority is found.
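  • A hedged sketch of that eviction walk: pages are kept in LRU order and the replacement pass skips high-priority pages, evicting the least recently used low-priority page (falling back to plain LRU if every page is high priority). The class name and priority encoding are assumptions.

```python
from collections import OrderedDict

HIGH_PRIORITY = 1   # assumed encoding: 1 = retain preferentially, 0 = evictable

class ScoredLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> (priority, metadata); MRU at the end

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
        return self.pages.get(page_id)

    def insert(self, page_id, priority, metadata):
        if page_id not in self.pages and len(self.pages) >= self.capacity:
            self._evict_one()
        self.pages[page_id] = (priority, metadata)
        self.pages.move_to_end(page_id)

    def _evict_one(self):
        for pid, (priority, _) in self.pages.items():   # oldest (LRU) first
            if priority != HIGH_PRIORITY:
                del self.pages[pid]                      # evict first low-priority page
                return
        self.pages.popitem(last=False)   # everything high priority: plain LRU fallback
```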
  • the cache may utilize other techniques for determining which metadata pages to evict, with these other techniques based at least in part on the priorities assigned in blocks 710 and 715 of method 700 .
  • multiple factors may be combined to generate a total score for each metadata page, with an LRU factor generating a first score, with a randomness measure generating a second score, and so on, with a plurality of scores used to generate the total score.
  • a scaling factor may be applied to each score to scale the individual scores according to a particular formula when generating the total score.
  • Other techniques for using the assigned priority as part of a cache replacement algorithm are possible and are contemplated.
  • method 700 may be performed at various times by a storage controller, processor, cache, and/or other control logic. In some embodiments, method 700 may be performed on a fixed schedule. However, in other embodiments, one or more events may trigger method 700 . These events may include detecting cache thrashing, determining there are processing resources available, determining the traffic being handled by the storage controller is below a threshold, and/or various other events.
  • FIG. 8 one embodiment of a method 800 for measuring the randomness of access patterns to regions of a logical address space is shown.
  • Any of the storage controllers, caches, and/or other control logic described throughout this specification may generally operate in accordance with method 800 .
  • the steps in this embodiment are shown in sequential order. However, some steps may occur in a different order than shown, some steps may be performed concurrently, some steps may be combined with other steps, and some steps may be absent in another embodiment.
  • a plurality of I/O accesses to one or more storage devices of a storage system may be captured over a first period of time (block 805 ).
  • the capturing of the I/O accesses includes storing the logical address of each access. Additional information associated with each I/O access may also be stored in some embodiments. The length of the first period of time may vary depending on the embodiment.
  • the distribution of the accesses to areas within the total logical address space may be analyzed to determine which areas of the logical address space have the highest numbers of accesses (block 810 ). Then, the total logical address space may be partitioned into a plurality of regions based on the distribution analysis (block 815 ).
  • areas with large numbers of I/O accesses may be partitioned into smaller sized regions than areas with small numbers of I/O accesses.
  • the logical address space may be partitioned using a predetermined partitioning pattern, and this partitioning may be performed prior to block 805 .
  • the logical address space may be partitioned into equal, 100 GB sized regions. Other sizes of regions may be utilized in other embodiments.
  • the captured I/O accesses may be stored in lists which correspond to the regions of the logical address space (block 820 ).
  • the logical address space may be partitioned into ten regions, and there may be a list for each of the ten regions.
  • Each I/O access may be stored in the list which corresponds to the region in which the address of the I/O access belongs.
  • only a single list may be maintained, but each I/O access within the list may be tagged with a region identifier (ID) which identifies which region the address of the I/O access falls within.
  • the addresses of the I/O accesses may be converted into a frequency domain representation (block 825 ).
  • the conversion into the frequency domain representation may be performed using a Fourier-related transform.
  • an FFT may be performed on the addresses of the I/O accesses of each region of the logical address space.
  • the number of addresses may not equal a power of two, and so the addresses may be padded with zeroes so that the total number of addresses and zeroes equals a power of two in order to improve the efficiency associated with implementing an FFT.
  • other types of transforms may be used to convert the addresses into a frequency domain representation.
  • the spectral analysis may be performed using any suitable technique.
  • the spectral analysis may involve determining if the corresponding frequency domain representation comprises mostly high frequency components or mostly low frequency components based on the frequency distribution of the frequency domain representation. Accordingly, the total spectral power below a first cutoff frequency may be calculated and compared to a first threshold, and the total spectral power above a second cutoff frequency may be calculated and compared to a second threshold (see the illustrative sketch following this list).
  • the peak amplitude within the frequency domain representation may be identified and used to characterize the corresponding region.
  • a score may be generated based on the spectral analysis of the corresponding frequency domain representation (block 835 ).
  • high scores may be given to frequency domain representations with mostly low frequency components and low scores may be given to frequency domain representations with mostly high frequency components.
  • other techniques for generating a score for a region may be utilized.
  • the score may be assigned to metadata stored in the cache based on the score of the region to which the metadata corresponds (block 840 ).
  • the cache replacement algorithm may utilize the generated scores to determine which pages in the cache to replace when new metadata is loaded in the cache (block 845 ).
  • the cache replacement algorithm may attempt to evict first metadata corresponding to one or more first workloads exhibiting high random access patterns while retaining second metadata corresponding to one or more second workloads exhibiting low random access patterns.
  • method 800 may be performed at various times by a storage controller, processor, cache, and/or other control logic. In some embodiments, method 800 may be performed on a fixed schedule. However, in other embodiments, one or more events may trigger method 800 .
  • in FIG. 9, one embodiment of a method 900 for prioritizing metadata stored in a cache is shown. Any of the storage controllers, caches, and/or other control logic described throughout this specification may generally operate in accordance with method 900.
  • the steps in this embodiment are shown in sequential order. However, some steps may occur in a different order than shown, some steps may be performed concurrently, some steps may be combined with other steps, and some steps may be absent in another embodiment.
  • the randomness of each access pattern of a plurality of access patterns targeting one or more storage devices may be measured (block 905 ).
  • the randomness of an access pattern may be measured by capturing a plurality of addresses of a plurality of accesses and then generating a frequency domain representation of the plurality of addresses. Then, a spectral analysis of the frequency domain representation may be performed to determine the randomness of the access pattern. If the spectral analysis determines there are mostly low frequency components in the frequency domain representation, then the access pattern may be identified as a low random access pattern. If the spectral analysis identifies mostly high frequency components in the frequency domain representation, then the access pattern may be identified as a high random access pattern. In other embodiments, other techniques for measuring the randomness of the access patterns may be utilized.
  • a first workload may be accessing a database. If a query is run on the database, there may be a pattern of accesses at fixed intervals to the database table. Accordingly, the first workload may be identified as a low random access pattern during the spectral analysis of its frequency domain representation, and then metadata corresponding to the first workload may be retained in the cache. Additionally, metadata corresponding to high random access patterns may be evicted from the cache when new metadata is loaded into the cache (block 915 ).
  • method 900 may be performed at various times by a storage controller, processor, cache, and/or other control logic. In some embodiments, method 900 may be performed on a fixed schedule. However, in other embodiments, one or more events may trigger method 900 .
  • the above-described embodiments may comprise software.
  • the program instructions that implement the methods and/or mechanisms may be conveyed or stored on a non-transitory computer readable medium.
  • non-transitory media which are configured to store program instructions are available and include hard disks, floppy disks, CD-ROM, DVD, flash memory, Programmable ROMs (PROM), random access memory (RAM), and various other forms of volatile or non-volatile storage.
  • resources may be provided over the Internet as services according to one or more various models.
  • models may include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).
  • in IaaS, computer infrastructure is delivered as a service.
  • the computing equipment is generally owned and operated by the service provider.
  • in PaaS, the software tools and underlying equipment used by developers to develop software solutions may be provided as a service and hosted by the service provider.
  • SaaS typically includes a service provider licensing software as a service on demand. The service provider may host the software, or may deploy the software to a customer for a given period of time. Numerous combinations of the above models are possible and are contemplated.
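As a non-limiting illustration of the cutoff-frequency comparison mentioned in the list above, the following Python sketch computes the spectral power below a first cutoff and above a second cutoff and compares each to a threshold. The cutoff frequencies, threshold values, and function name are assumptions introduced here for clarity, not values taken from the disclosure.

    import numpy as np

    def classify_spectrum(addresses, low_cutoff=0.1, high_cutoff=0.4,
                          low_threshold=0.6, high_threshold=0.6):
        # Compare the spectral power below/above two cutoff frequencies against
        # thresholds (all four parameters are illustrative assumptions).
        samples = np.asarray(addresses, dtype=float)
        samples = samples - samples.mean()            # remove the DC offset
        power = np.abs(np.fft.rfft(samples)) ** 2
        freqs = np.fft.rfftfreq(len(samples))         # normalized: 0 .. 0.5 cycles/access
        total = power.sum() + 1e-12
        if power[freqs < low_cutoff].sum() / total > low_threshold:
            return "low-random"                       # mostly low frequency components
        if power[freqs > high_cutoff].sum() / total > high_threshold:
            return "high-random"                      # mostly high frequency components
        return "indeterminate"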

Abstract

A system and method for efficiently caching metadata in a storage system. Addresses from a plurality of I/O accesses to the storage system are captured and then a frequency domain representation of the addresses is generated. The frequency domain representation is used to measure the randomness of the various applications which are accessing the storage system. Scores are generated based on the measure of randomness, and scores are assigned to the various regions of the logical address space. Scores are then assigned to the metadata pages which are stored in the cache based on the region of the logical address space to which the metadata pages correspond. The scores are used when determining which metadata pages to evict from the cache. The cache will attempt to evict those metadata pages which correspond to regions of the logical address space that are servicing random I/O accesses.

Description

CROSS-REFERENCE TO RELATED APPLICATION
This application is a continuation application of and claims priority from U.S. patent application Ser. No. 14/151,257, filed on Jan. 9, 2014.
BACKGROUND
Field of the Invention
This invention relates to a storage system, and more particularly to caching metadata in a storage system.
Description of the Related Art
Storage systems often store large amounts of data and process a variety of different workloads from various numbers of clients. These storage systems typically have non-volatile storage devices which are used to store client data, and volatile memory to cache metadata used for locating the client data. As the amount of data increases, so does the amount of metadata, and determining which metadata to store in the cache(s) becomes more challenging.
Software applications, such as a logical volume manager or a disk array manager, provide a means of allocating space in storage systems. In addition, a system administrator can create units of storage groups including logical volumes. Storage virtualization provides an abstraction of logical storage from physical storage in order to access logical storage without end-users identifying physical storage. The logical storage may be accessed via a logical address space, with a volume and block number of a given request being used to generate an address within the logical address space.
To support storage virtualization, a volume manager performs input/output (I/O) redirection by translating incoming I/O requests using logical addresses from end-users into new requests using addresses associated with physical locations in the storage devices. As some storage devices include additional address translation mechanisms, such as address translation layers which may be used in solid state storage devices, the translation from a logical address to another address may not be the only or final address translation. Redirection utilizes metadata stored in one or more mapping tables. In addition, information stored in the one or more mapping tables may be used for storage deduplication.
For example, in one embodiment, a data storage subsystem may be coupled to a network, and the data storage subsystem may receive read and write requests via the network from one or more client computers. The data storage subsystem may include a plurality of data storage locations on a device group including a plurality of storage devices. The data storage subsystem may also include one or more mapping tables storing a plurality of entries for translating logical addresses of received requests to physical addresses corresponding to data storage locations. Rather than storing the entirety of the mapping table(s) in the device group, portions of the mapping table may be stored in a cache for faster access, allowing some lookups to be performed more efficiently with fewer accesses to the storage devices. The mapping table(s) may be organized into pages, with each page storing a plurality of entries. Portions of the mapping table may be added and evicted from the cache in page size allocation units. In other embodiments, other allocation unit sizes may be chosen.
A typical storage system may process a variety of different types of data workloads. Some of the workloads may have random access patterns while other workloads may have more predictable access patterns. As metadata from these various workloads competes for cache space, it is challenging for the storage system to come up with efficient schemes for choosing which metadata to retain in the cache.
SUMMARY OF THE INVENTION
Various embodiments of systems and methods for caching metadata in a storage system are contemplated.
In one embodiment, the storage system may include a storage controller, a cache, and a plurality of storage devices. The storage controller may be configured to analyze the workloads that are being processed. In one embodiment, the storage controller may determine which workloads have random access patterns and which workloads have predictable access patterns. Also, the storage controller may identify metadata which corresponds to the workloads with random access patterns and identify which metadata corresponds to the workloads with predictable access patterns. The metadata associated with the random workloads may be evicted from the cache while the metadata associated with the predictable workloads may be retained in the cache.
In one embodiment, a plurality of addresses corresponding to a plurality of input/output (I/O) accesses to the storage system may be captured, with the plurality of addresses targeting the logical address space of the storage system. The logical address space may be partitioned into a plurality of regions, and the plurality of addresses may be sorted into a plurality of lists, with one list for each region of the logical address space.
For each list of captured addresses, the list may be transformed into a frequency domain representation to allow for spectral analysis of the frequency components of the access pattern to the corresponding region. In one embodiment, a Fourier-related transform may be utilized to generate the frequency domain representation of each list. In one embodiment, a score may be generated for each region based on the analysis of the corresponding frequency domain representation. A cache replacement algorithm may utilize the generated scores to determine which pages in the cache to replace when new metadata needs to be loaded into the cache. The cache replacement algorithm may attempt to prevent metadata for workloads with random access patterns from kicking out metadata for workloads that have predictable access patterns.
In one embodiment, if a given frequency domain representation indicates the access pattern is a highly random access pattern, then a low score may be given to the corresponding region. Any metadata pages containing address translations for this region may be assigned this low score when these metadata pages are stored in the cache. If a given frequency domain representation indicates the access pattern is a low random access pattern, then a high score may be given to the corresponding region. This high score may be assigned to any metadata pages which have address translations for this region and which are stored in the cache. The cache may retain metadata pages with high scores while evicting metadata pages with low scores.
Low random access patterns tend to correspond to accesses that will retarget the same region of the logical address space for future accesses. Accordingly, metadata pages corresponding to regions with low random access patterns are likely to be reused and the cache may attempt to retain metadata pages with high scores in the cache. Highly random access patterns tend to correspond to accesses that will not come back to the same region of the logical address space for future accesses. Therefore, metadata pages corresponding to regions with high random access patterns are not likely to be reused and the cache may attempt to evict metadata pages with low scores from the cache. In this way, the efficiency of the storage system will be improved by retaining metadata pages in the cache which are likely to be used again, resulting in fewer lookups to the storage devices for metadata.
These and other embodiments will become apparent upon consideration of the following description and accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a generalized block diagram illustrating one embodiment of a storage system.
FIG. 2 is a block diagram illustrating one embodiment of a mapping table.
FIG. 3 illustrates one embodiment of a storage controller.
FIG. 4 illustrates one embodiment of a listing of captured I/O accesses.
FIG. 5 illustrates one embodiment of a frequency domain representation of an I/O access listing.
FIG. 6 illustrates one embodiment of converting addresses of I/O accesses into a frequency domain representation.
FIG. 7 is a generalized flow diagram illustrating one embodiment of a method for assigning priorities to metadata stored in a cache.
FIG. 8 is a generalized flow diagram illustrating one embodiment of a method for measuring the randomness of access patterns to regions of a logical address space.
FIG. 9 is a generalized flow diagram illustrating one embodiment of a method for prioritizing metadata stored in a cache.
While the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
DETAILED DESCRIPTION
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. However, one having ordinary skill in the art should recognize that the various embodiments might be practiced without these specific details. In some instances, well-known circuits, structures, signals, computer program instructions, and techniques have not been shown in detail to avoid obscuring the present invention. It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements.
This specification includes references to “one embodiment”. The appearance of the phrase “in one embodiment” in different contexts does not necessarily refer to the same embodiment. Particular features, structures, or characteristics may be combined in any suitable manner consistent with this disclosure. Furthermore, as used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including, but not limited to.
Terminology. The following paragraphs provide definitions and/or context for terms found in this disclosure (including the appended claims):
“Comprising.” This term is open-ended. As used in the appended claims, this term does not foreclose additional structure or steps. Consider a claim that recites: “A system comprising a storage controller . . . .” Such a claim does not foreclose the system from including additional components (e.g., network interface, display device).
“Configured To.” Various units, circuits, or other components may be described or claimed as “configured to” perform a task or tasks. In such contexts, “configured to” is used to connote structure by indicating that the units/circuits/components include structure (e.g., circuitry) that performs the task or tasks during operation. As such, the unit/circuit/component can be said to be configured to perform the task even when the specified unit/circuit/component is not currently operational (e.g., is not on). The units/circuits/components used with the “configured to” language include hardware—for example, circuits, memory storing program instructions executable to implement the operation, etc. Reciting that a unit/circuit/component is “configured to” perform one or more tasks is expressly intended not to invoke 35 U.S.C. §112, sixth paragraph, for that unit/circuit/component. Additionally, “configured to” can include generic structure (e.g., generic circuitry) that is manipulated by software and/or firmware (e.g., an FPGA or a general-purpose processor executing software) to operate in a manner that is capable of performing the task(s) at issue. “Configured to” may also include adapting a manufacturing process (e.g., a semiconductor fabrication facility) to fabricate devices (e.g., integrated circuits) that are adapted to implement or perform one or more tasks.
“First,” “Second,” etc. As used herein, these terms are used as labels for nouns that they precede, and do not imply any type of ordering (e.g., spatial, temporal, logical, etc.). For example, the terms “first” and “second” regions of a logical address space can be used to refer to any two regions.
“Based On.” As used herein, this term is used to describe one or more factors that affect a determination. This term does not foreclose additional factors that may affect a determination. That is, a determination may be solely based on those factors or based, at least in part, on those factors. Consider the phrase “determine A based on B.” While B may be a factor that affects the determination of A, such a phrase does not foreclose the determination of A from also being based on C. In other instances, A may be determined based solely on B.
Referring now to FIG. 1, a generalized block diagram of one embodiment of a storage system 100 is shown. Storage system 100 may include storage controller 110 and storage device groups 130 and 140, which are representative of any number of storage device groups (or data storage arrays). As shown, storage device group 130 includes storage devices 135A-N, which are representative of any number and type of storage devices (e.g., solid-state drives (SSDs)). Storage controller 110 may be coupled directly to client computer system 125, and storage controller 110 may be coupled remotely over network 120 to client computer system 115. Clients 115 and 125 are representative of any number of clients which may utilize storage controller 110 for storing and accessing data in system 100.
Storage controller 110 may include software and/or hardware configured to provide access to storage devices 135A-N. Although storage controller 110 is shown as being separate from storage device groups 130 and 140, in some embodiments, storage controller 110 may be located within one or each of storage device groups 130 and 140. Storage controller 110 may include or be coupled to a base operating system (OS), a volume manager, and additional control logic for implementing the various techniques disclosed herein.
Storage controller 110 may include and/or execute on any number of processors and may include and/or execute on a single host computing device or be spread across multiple host computing devices, depending on the embodiment. In some embodiments, storage controller 110 may generally include or execute on one or more file servers and/or block servers. Storage controller 110 may run any type of OS (e.g., Windows®, Unix®, Linux®, Solaris®, MacOS®) depending on the embodiment.
It is noted that in alternative embodiments, the number and type of clients, storage controllers, networks, storage device groups, and data storage devices is not limited to those shown in FIG. 1. Furthermore, in various embodiments, the methods and mechanisms disclosed herein can be implemented in various networks and systems including computer systems, security systems, wireless networks, network architectures, data centers, operating systems, communication devices, and various other devices and systems.
Network 120 may utilize a variety of techniques including wireless connection, direct local area network (LAN) connections, wide area network (WAN) connections such as the Internet, a router, storage area network, Ethernet, and others. Network 120 may comprise one or more LANs that may also be wireless. Network 120 may further include remote direct memory access (RDMA) hardware and/or software, transmission control protocol/internet protocol (TCP/IP) hardware and/or software, router, repeaters, switches, grids, and/or others. Protocols such as Fibre Channel, Fibre Channel over Ethernet (FCoE), iSCSI, and so forth may be used in network 120. The network 120 may interface with a set of communications protocols used for the Internet such as the Transmission Control Protocol (TCP) and the Internet Protocol (IP), or TCP/IP.
Client computer systems 115 and 125 are representative of any number and type of stationary or mobile computers such as desktop personal computers (PCs), servers, server farms, workstations, laptops, handheld computers, personal digital assistants (PDAs), smart phones, and so forth. Generally speaking, client computer systems 115 and 125 include one or more processors comprising one or more processor cores. Each processor core includes circuitry for executing instructions according to a predefined general-purpose instruction set. For example, the x86 instruction set architecture may be selected. Alternatively, the ARM®, Alpha®, PowerPC®, SPARC®, or any other general-purpose instruction set architecture may be selected. The processor cores may access cache memory subsystems for data and computer program instructions. The cache subsystems may be coupled to a memory hierarchy comprising random access memory (RAM) and a storage device.
Turning now to FIG. 2, a generalized block diagram of one embodiment of a mapping table is shown. One or more mapping tables may be used for I/O redirection or translation, deduplication of duplicate copies of user data, snapshot mappings, and so forth. Mapping tables may be stored in the storage devices 135A-N (of FIG. 1). The diagram shown in FIG. 2 represents a logical representation of one embodiment of the organization and storage of the mapping table. Each level shown may include mapping table entries corresponding to a different period of time. For example, level “1” may include information older than information stored in level “2”. Similarly, level “2” may include information older than information stored in level “3”. The information stored in the records, pages and levels shown in FIG. 2 may be stored in a random-access manner within the storage devices 135A-N. Additionally, copies of portions or all of a given mapping table's entries may be stored in a random-access memory (RAM), in buffers within a storage controller, and/or in one or more caches for faster access. In various embodiments, a corresponding index may be included in each level for mappings which are part of the level. Such an index may include an identification of mapping table entries and where they are stored (e.g., an identification of the page) within the level. In other embodiments, the index associated with mapping table entries may be a distinct entity, or entities, which are not logically part of the levels themselves.
Generally speaking, each mapping table comprises a set of rows and columns. A single record may be stored in a mapping table as a row. A record may also be referred to as an entry. In one embodiment, a record stores at least one tuple including a key. Tuples may (or may not) also include data fields including data such as a pointer used to identify or locate data components stored in the storage subsystem. It is noted that in various embodiments, the storage subsystem may include storage devices (e.g., SSDs) which have internal mapping mechanisms. In such embodiments, the pointer in the tuple may not be an actual physical address per se. Rather, the pointer may be a logical address which the storage device maps to a physical location within the device. Over time, this internal mapping between logical address and physical location may change. In other embodiments, records in the mapping table may only contain key fields with no additional associated data fields. Attributes associated with a data component corresponding to a given record may be stored in columns, or fields, in the table. Status information, such as a valid indicator, a data age, a data size, and so forth, may be stored in fields, such as Field0 to FieldN shown in FIG. 2.
A key is an entity in a mapping table that may distinguish one row of data from another row. Each row may also be referred to as an entry or a record. A key may be a single column, or it may consist of a group of columns used to identify a record. In one example, an address translation mapping table may utilize a key comprising a volume identifier (ID), a logical or virtual address, a snapshot ID, a sector number, and so forth. A given received read/write storage access request may identify a particular volume, sector and length. A sector may be a logical block of data stored in a volume. Sectors may have different sizes on different volumes. The address translation mapping table may map a volume in sector-size units.
In one embodiment, a volume identifier (ID) along with a received sector number may be used to access the address translation mapping table. Therefore, in such an embodiment, the key value for accessing the address translation mapping table is the combination of the volume ID and the received sector number. In other embodiments, other values may be used to generate a key value. In one embodiment, the records within the address translation mapping table are sorted by key value.
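As an illustration of the key scheme just described, the following Python sketch models an address translation mapping table whose records are kept sorted by a (volume ID, sector number) key. The class and method names are hypothetical, and the physical pointer is treated as an opaque value.

    from bisect import bisect_left

    class AddressTranslationTable:
        # Toy sorted mapping table keyed by (volume ID, sector number).
        def __init__(self):
            self._keys = []      # sorted list of (volume_id, sector) keys
            self._values = []    # physical pointers, parallel to _keys

        def insert(self, volume_id, sector, physical_pointer):
            key = (volume_id, sector)
            i = bisect_left(self._keys, key)
            self._keys.insert(i, key)
            self._values.insert(i, physical_pointer)

        def lookup(self, volume_id, sector):
            key = (volume_id, sector)
            i = bisect_left(self._keys, key)
            if i < len(self._keys) and self._keys[i] == key:
                return self._values[i]    # e.g. a (segment ID, physical address) pair
            return None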
The address translation mapping table may convey a physical pointer value that indicates a location within the data storage subsystem 170 storing a data component corresponding to the received data storage access request. The key value may be compared to one or more key values stored in the mapping table. In the illustrated example, simpler key values, such as “0”, “2”, “12” and so forth, are shown for ease of illustration. The physical pointer value may be stored in one or more of the fields in a corresponding record.
The physical pointer value may include a segment identifier (ID) and a physical address identifying the location of storage. A segment may be a basic unit of allocation in each of the storage devices 135A-N. A segment may have a redundant array of independent devices (RAID) level and a data type. During allocation, a segment may have one or more of the storage devices 135A-N selected for corresponding storage. In one embodiment, a segment may be allocated an equal amount of storage space on each of the one or more selected storage devices of the storage devices 135A-N.
In another example, the mapping table shown in FIG. 2 may be a deduplication table. A deduplication table may utilize a key comprising a hash value determined from a data component associated with a storage access request. The initial steps of a deduplication operation may be performed concurrently with other operations, such as a read/write request, a garbage collection operation, a trim operation, and so forth. For a given write request, the data sent from one of the client computer systems may be a data stream, such as a byte stream. A chunking algorithm may perform the dividing of the data stream into discrete data components which may be referred to as “chunks”. A chunk may be a sub-file content-addressable unit of data. The resulting chunks may then be stored in one of the data storage arrays 120 a-120 b to allow for sharing of the chunks. Such chunks may be stored separately or grouped together in various ways.
In various embodiments, the chunks may be represented by a data structure that allows reconstruction of a larger data component from its chunks (e.g. a particular file may be reconstructed based on one or more smaller chunks of stored data). A corresponding data structure may record its corresponding chunks including an associated calculated hash value, a pointer (physical and/or logical) to its location in a storage device 135A-N, and its length.
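A minimal sketch of the chunking and hashing step described above is shown below, assuming fixed-size chunks and a SHA-256 digest as the deduplication key; both choices are illustrative simplifications rather than requirements of the disclosure.

    import hashlib

    def chunk_and_hash(byte_stream, chunk_size=4096):
        # Split a byte stream into fixed-size chunks and compute a hash key for
        # each chunk (content-defined chunking could be substituted here).
        for offset in range(0, len(byte_stream), chunk_size):
            chunk = byte_stream[offset:offset + chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            yield key, offset, len(chunk)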
A mapping table may comprise one or more levels as shown in FIG. 2. A mapping table may comprise 16 to 64 levels, although mapping tables with other numbers of levels are possible and contemplated. In FIG. 2, three levels labeled Level “1”, Level “2” and Level “N” are shown for ease of illustration. Each level within a mapping table may include one or more partitions. In one embodiment, multiple levels within a mapping table are sorted by time. For example, in FIG. 2, Level “1” may be older than Level “2”. Similarly, Level “2” may be older than Level “N”.
In one embodiment, each partition is a 4 kilo-byte (KB) page. For example, Level “N” is shown to comprise pages 210 a-210 g, Level “2” comprises pages 210 h-210 j and Level “1” comprises pages 210 k-210 n. It is possible and contemplated that other partition sizes may also be chosen for each of the levels within a mapping table. In addition, it is possible that one or more levels may have a single partition, which is the level itself.
Turning now to FIG. 3, a block diagram of one embodiment of a storage controller 300 is shown. Storage controller 300 may include cache 305, metadata frequency analyzer 310, and processor(s) 315. Metadata frequency analyzer 310 may be implemented using any combination of hardware and/or software. It is noted that while metadata frequency analyzer 310 is shown separately from processor(s) 315, portions or the entirety of metadata frequency analyzer 310 may be executed by processor(s) 315. It is noted that storage controller 300 may also include other logic and components (e.g., network interface, RAM) which are not shown in FIG. 3 for ease of illustration. Storage controller 300 may also be coupled to one or more clients (not shown) and one or more storage devices (not shown).
Storage controller 300 may be configured to receive I/O requests targeting one or more storage devices of a storage system. Storage controller 300 may also be configured to process the received I/O requests by storing data at the targeted locations or retrieving data from the targeted locations. In order to locate the targeted locations, storage controller 300 may retrieve metadata corresponding to the logical addresses of the received I/O requests. In one embodiment, the metadata may include mapping table entries and/or index entries, with the mapping table entries including translations from the logical address space to the physical address space corresponding to the storage devices of the storage system.
Storage controller 300 may be configured to reduce the latency of I/O accesses targeting the one or more storage devices of a storage system. One approach for reducing latency is to cache metadata so as to decrease the number of times the external storage devices are accessed. Cache 305 may be configured to store metadata for the various applications being processed by the host storage system. In some embodiments, cache 305 may store both metadata and data. In other embodiments, cache 305 may store only metadata. Cache 305 may have any configuration (e.g., direct mapped or set associative).
While metadata is shown as being stored in cache 305 in page-sized units (e.g., metadata pages 325A-B), with each page including a plurality of translation entries, it is noted that this is merely for illustrative purposes. In other embodiments, other unit sizes of metadata may be stored in cache 305. For example, in another embodiment, individual translation entries may be allocated in cache 305. The allocation size of metadata stored in cache 305 may also be referred to more generally as a “metadata grain”.
Metadata frequency analyzer 310 may be configured to perform a frequency analysis on the access patterns to the one or more storage devices of the storage system. Metadata frequency analyzer 310 may include any combination of hardware and/or software. In one embodiment, a plurality of received I/O accesses may be captured by storage controller 300 and provided as inputs to metadata frequency analyzer 310. More specifically, the logical addresses of the received I/O accesses may be captured and logged into one or more lists. In one embodiment, each logical address may consist of a volume ID and a logical block address (LBA). The one or more lists may then be transformed from the logical address space domain to the frequency domain. In one embodiment, the logical address space may be treated as though it were the time domain when using a Fourier-related transform to transform the addresses into the frequency domain. For example, each access may be considered to have been received a fixed amount of time subsequent to the previous access. In this embodiment, the actual time the access was made will not be captured, but only the order in which the accesses were made will be retained.
In one embodiment, metadata frequency analyzer 310 may receive as an input the address offsets of the I/O accesses in the logical address space. Then, metadata frequency analyzer 310 may convert these logical address offsets to the frequency domain. In one embodiment, metadata frequency analyzer 310 may use a Fourier transform, such as the discrete Fourier transform, to generate a frequency domain representation of the logical address offsets. In another embodiment, metadata frequency analyzer 310 may use a discrete cosine transform (DCT) to convert the addresses to the frequency domain. Using the DCT, metadata frequency analyzer 310 may convert the sequence of address values into a sum of cosine terms oscillating at different frequencies. In other embodiments, other types of transforms (e.g., wavelet) may be used to convert the address offsets to the frequency domain.
After the logical address space offsets are converted into the frequency domain, metadata frequency analyzer 310 may perform a spectral analysis of the generated frequency domain components. In one embodiment, if most of the energy in the frequency domain signal is located in the low frequency components, then metadata frequency analyzer 310 may identify these accesses as a predictable, low-random access pattern. If most of the energy in the frequency domain signal is located in the high frequency components, then metadata frequency analyzer 310 may identify these accesses as a highly-random access pattern.
In one embodiment, the logical address space may be partitioned into a plurality of regions. Metadata frequency analyzer 310 may utilize a scoring function formula to generate a score for the various regions of the logical address space. For example, if a first region which is servicing requests corresponds to a low-random access pattern as determined by the spectral analysis, then the first region may be given a high score. Any metadata pages which are stored in cache 305 and which correspond to the first region may be assigned the high score. If a second region corresponds to a highly random access pattern, then the second region may be given a low score. Any metadata pages which are stored in cache 305 and which correspond to the second region may be assigned the low score. As shown in FIG. 3, metadata pages 325A-B have been assigned scores 320A-B, which may correspond to the scores assigned to their corresponding regions in the logical address space.
When cache 305 needs to evict a metadata page, cache 305 may prioritize retaining metadata pages with a high score while attempting to evict metadata pages with a low score. It is noted that the assignment of scores may be reversed in other embodiments, such that highly random access pattern regions may be given a high score and low random access pattern regions may be given a low score. In these embodiments, cache 305 may prioritize retaining metadata pages with a low score while attempting to evict metadata pages with a high score. Any of various scoring functions may be utilized to generate a score for the various regions of the logical address space based on the corresponding frequency domain representations. For example, in one embodiment, an integral of the frequency domain representation may be calculated to generate a score for a given region.
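One possible realization of the scoring flow just described is sketched below in Python. The frequency-weighted average used as the measure of randomness, the mapping of that measure onto a 0-to-1 score, and the dictionary-based page tagging are illustrative assumptions; score_region and assign_page_scores are hypothetical names, not elements of the disclosure.

    import numpy as np

    def score_region(region_addresses):
        # Frequency-weighted average of the spectrum: low-frequency-dominated
        # regions score near 1, high-frequency-dominated regions score near 0.
        samples = np.asarray(region_addresses, dtype=float)
        samples = samples - samples.mean()
        power = np.abs(np.fft.rfft(samples)) ** 2
        freqs = np.fft.rfftfreq(len(samples))          # normalized: 0 .. 0.5
        randomness = float((freqs * power).sum() / (power.sum() + 1e-12))
        return 1.0 - randomness / 0.5                  # invert: low randomness -> high score

    def assign_page_scores(captured_by_region, cached_pages):
        # captured_by_region: {region_id: [logical addresses]} (hypothetical structure)
        # cached_pages: list of dicts, each with a "region_id" key (hypothetical structure)
        region_scores = {r: score_region(a) for r, a in captured_by_region.items()}
        for page in cached_pages:
            page["score"] = region_scores.get(page["region_id"], 0.0)
        return cached_pages

In this sketch, a region dominated by low frequency components yields a score near one, so a cache following the eviction preference described above would retain its metadata pages and evict pages whose scores are lowest.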
Referring now to FIG. 4, one embodiment of a listing 400 of captured I/O accesses is shown. Listing 400 may include the most recently detected I/O accesses to the storage devices in a storage system (e.g., storage system 100 of FIG. 1). The time period over which listing 400 was captured may vary depending on the embodiment. Also, listing 400 includes a number of accesses ‘N’, wherein ‘N’ is representative of any number of accesses, depending on the embodiment.
In one embodiment, a storage controller may capture I/O accesses over a certain period of time. In another embodiment, the storage controller may start capturing I/O accesses and continue capturing I/O accesses until a certain threshold number of I/O accesses has been reached. The threshold number of accesses may vary depending on the embodiment.
As is shown in listing 400, each logical address of the access may be logged and stored in listing 400. These addresses are shown starting with A1, which is followed by A2, A3, and so on until AN, which represents the logical address of the last captured access. Listing 400 may be treated as though the access number were the x (or horizontal) axis and the logical address were the y (or vertical) axis. Listing 400 may then be converted into a frequency domain representation using any of various transforms (e.g., Discrete Fourier Transform (DFT), DCT, wavelet transform). In one embodiment, the conversion to the frequency domain representation may be performed by assuming the access number is a time measurement and by assuming the logical address is an amplitude. In other words, the columns of listing 400 may be treated as though they were time (or sample number) and amplitude rather than access number and logical address, respectively. Therefore, the conversion to the frequency domain representation is straightforward and may be performed using any of various techniques well known to those skilled in the art.
In some embodiments, listing 400 may be split up into multiple listings, and accesses may be categorized according to the region of the logical address space in which they are located. For example, if a logical address space is 4 gigabytes (GB) in size, then each 1 GB region of the logical address space may have its own listing. Any accesses that fall within the first GB of the logical address space may be stored in a first listing, accesses that fall within addresses 1 GB-2 GB may be stored in a second listing, and so on. In this way, a different frequency domain representation of each region may be generated and a score may be assigned to a metadata page based on the score of the region in which the metadata page is located.
In some embodiments, the regions may all be the same size, as in the example described above with 1 GB size regions. However, in other embodiments, the regions may be different sizes, with some regions larger than other regions. For example, in one embodiment, an address space may be split up into 10 regions, with 6 of the regions equal in size at 1 GB, while 2 of the regions are of size 500 megabytes (MB), and the remaining 2 regions are of size 250 MB.
In one embodiment, the logical address space may be partitioned into regions prior to capturing addresses of I/O accesses. In this embodiment, there may be a listing for each region of the logical address space, and the captured addresses may be stored in the listing corresponding to the region in which they are located. In another embodiment, the logical address space may be partitioned into regions after capturing addresses of I/O accesses. In this embodiment, the addresses may be analyzed prior to partitioning the logical address space into regions to determine how best to perform the partitioning. For example, if a large number of accesses are made to a particular area of the logical address space, then this area may be partitioned into smaller regions as compared to areas of the logical address space with small numbers of accesses. It is noted that other ways of partitioning the logical address space into regions are possible and are contemplated.
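For the fixed-size partitioning variant, sorting captured addresses into per-region lists might look like the following sketch; the 1 GB region size matches the example above, and the function name is hypothetical. An adaptive variant could instead choose region boundaries after inspecting the access distribution, as described in the preceding paragraph.

    from collections import defaultdict

    def partition_fixed(addresses, region_size=1 << 30):
        # Fixed-size partitioning: each 1 GB slice of the logical address space
        # is one region, and every captured address is appended to its region's list.
        regions = defaultdict(list)
        for addr in addresses:
            regions[addr // region_size].append(addr)
        return regions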
In one embodiment, once the regions have been defined, the captured addresses of I/O accesses may be assigned to their appropriate listings. Then, for each listing, the addresses in the logical address space may be converted into a frequency domain representation using any suitable transform (e.g., DFT, fast Fourier transform (FFT), DCT). A spectral analysis may be performed on each frequency domain representation so as to generate a score for the corresponding region in the logical address space. In one embodiment, if the frequency domain representation has mostly low frequency components, then a high score may be generated for the region. If the frequency domain representation has mostly high frequency components, then a low score may be generated for the region. Then, the metadata pages stored in the cache may be scored according to the score of the region to which they correspond. The cache may then utilize this score when determining which metadata pages to evict from the cache. The cache may attempt to evict metadata pages with a low score, corresponding to a region with mostly high frequency components.
Referring now to FIG. 5, one embodiment of a frequency domain representation of an I/O access listing is shown. It should be noted that frequency domain representation 500 is merely one example of a frequency domain representation after the addresses of a listing (e.g., listing 400 of FIG. 4) have been converted into the frequency domain. Other frequency domain representations may have a different distribution of frequency components depending on the types of access patterns used to generate the addresses of the corresponding listings. As shown, frequency domain representation 500 includes mostly high frequency components, which corresponds to a highly random access pattern for the addresses of the accesses in the corresponding listing.
In one embodiment, a series of frequency bins may be used to represent frequency domain representation 500. The frequency bins may divide the total signal spectrum into equally spaced frequency ranges, and the size of each bin may vary according to the embodiment. For example, in one embodiment, each frequency bin (F1, F2, etc.) shown on the horizontal axis may correspond to 1 kilohertz (kHz) of frequency range. The vertical axis may measure the amplitude of the energy in each frequency bin, and the amplitude may be measured using any suitable unit. It is noted that the term “energy” as used in this context is meant to indicate that standard techniques for analyzing and measuring a frequency domain representation may be utilized. However, the term “energy” is not intended to suggest that the original addresses contain energy in the same manner as an electrical signal undergoing a frequency domain transformation. The term “energy” may be defined as the numerical value of the frequency components in the frequency domain transformation.
Accordingly, the frequency component values in the frequency range from 0 to 1 kHz may be calculated and displayed above the frequency bin F1 in FIG. 5. The frequency component values in the frequency range from 1 kHz to 2 kHz are shown above frequency bin F2, the values in the frequency range from 2 kHz to 3 kHz are shown above frequency bin F3, and so on. It is noted that this is merely one example of a way to partition the total frequency range into bins for a particular spectral analysis. In other embodiments, other numbers of frequency bins may be utilized and the frequency bins may correspond to other sizes of frequency ranges. In a further embodiment, rather than combining the values of frequency components over a fixed range into a frequency bin, the frequency domain representation may be analyzed using the discrete frequency components generated by the transformation from the corresponding address listing. In other embodiments, frequency domain representation 500 may be analyzed using other suitable techniques. For example, components at a predetermined percentage or decibel level above the average signal level may be identified, a peak signal amplitude level may be located, and/or any other suitable spectral analysis may be used to identify the type of access pattern which generated spectrum 500.
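The binning just described can be approximated with the following sketch, which sums the magnitudes of the transform output into a small number of equally spaced bins; the bin count of five mirrors bins F1-F5 of FIG. 5 but is otherwise an arbitrary assumption.

    import numpy as np

    def bin_spectrum(frequency_components, num_bins=5):
        # Sum the magnitudes of the frequency components into equally spaced
        # bins (F1..F5 in FIG. 5); the number and width of the bins may vary.
        mags = np.abs(np.asarray(frequency_components))
        edges = np.linspace(0, len(mags), num_bins + 1).astype(int)
        return [float(mags[edges[i]:edges[i + 1]].sum()) for i in range(num_bins)]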
A measure of randomness may be generated for frequency domain representation 500 based on a spectral analysis of the various frequency bins F1-F5. In one embodiment, the measure of randomness may then be compared to one or more thresholds to determine if the corresponding access pattern is a low random access pattern or high random access pattern. Metadata corresponding to a low random access pattern may be prioritized for retention in a cache while metadata corresponding to a high random access pattern may be evicted from the cache.
In one embodiment, an integral of the frequency components of representation 500 may be computed in order to measure an amount of randomness in the corresponding access pattern, with the integral giving more weight to higher frequency components. Thus, a frequency domain representation with mostly high frequency components will have a relatively high value when the integral is computed. Otherwise, a frequency domain representation with mostly low frequency components will have a relatively low value when the integral is computed. Accordingly, a frequency representation with mostly high frequency components may generate a high measure of randomness while a frequency representation with mostly low frequency components may generate a low measure of randomness. Frequency domain representations with values spread out evenly between high and low frequency components will generate a measure of randomness in the middle of the measurement range.
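A frequency-weighted integral of the kind described above might be computed as follows; the linear weighting and the two threshold values are assumptions made for illustration, not values specified by the disclosure.

    import numpy as np

    def randomness_measure(frequency_components):
        # Frequency-weighted integral: higher frequency components are weighted
        # more heavily, so a larger value indicates a more random access pattern.
        mags = np.abs(np.asarray(frequency_components))
        weights = np.linspace(0.0, 1.0, len(mags))     # weight grows with frequency index
        return float((weights * mags).sum() / (mags.sum() + 1e-12))

    def classify(measure, low_threshold=0.35, high_threshold=0.65):
        # Threshold values are illustrative assumptions only.
        if measure <= low_threshold:
            return "low random access pattern"
        if measure >= high_threshold:
            return "high random access pattern"
        return "intermediate"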
The measure of randomness may then be converted into a score which may then be assigned to the region of the logical address space corresponding to frequency domain representation 500. In one embodiment, a high measure of randomness may be converted to a low score while a low measure of randomness may be converted to a high score. Then, any metadata pages stored in the cache which correspond to a given region may be assigned the score which was generated for the given region. In other embodiments, the measures of randomness may be converted to scores using other techniques.
Turning now to FIG. 6, one embodiment of converting addresses of I/O accesses into a frequency domain representation is shown. The 1×8 matrix 605 includes eight addresses (A1-A8) from captured I/O accesses. The eight addresses in matrix 605 may be converted into frequency domain representation 615 using 8×8 DCT matrix 610. Standard matrix multiplication may be utilized with the eight addresses (A1-A8) multiplied by the first column of matrix 610 to generate the value F1 of matrix 615, the eight addresses (A1-A8) multiplied by the second column of matrix 610 to generate the value F2, and so on.
Matrix 610 includes a zero frequency waveform in the leftmost column and the frequency increases in each column to the right with the highest frequency waveform shown in the rightmost column. Accordingly, frequency domain representation 615 includes eight frequency components (F1-F8), with F1 representing the lowest frequency and F8 representing the highest frequency. Frequency domain representation 615 may be analyzed to determine which frequency components have the highest values and to generate a corresponding randomness measure.
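The matrix multiplication of FIG. 6 can be reproduced with the sketch below, which builds an unnormalized DCT-II basis whose leftmost column is a zero-frequency waveform and whose column frequency increases to the right. The particular numeric entries of matrix 610 in the figure may differ from this construction, and the address values used here are placeholders.

    import numpy as np

    def dct_matrix(n=8):
        # Unnormalized DCT-II basis: column k samples a cosine of increasing
        # frequency, with the leftmost column (k = 0) a zero-frequency waveform.
        rows = np.arange(n).reshape(-1, 1)    # sample index within a column
        cols = np.arange(n).reshape(1, -1)    # column index = frequency index
        return np.cos(np.pi * (rows + 0.5) * cols / n)

    addresses = np.array([3.0, 7.0, 1.0, 9.0, 4.0, 8.0, 2.0, 6.0])  # placeholders for A1-A8
    frequency_components = addresses @ dct_matrix(8)                # 1x8 result, F1-F8
    print(frequency_components)   # F1 is the zero-frequency term, F8 the highest frequency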
It is noted that the small number of addresses being converted into a frequency domain representation in FIG. 6 is shown merely for the purposes of illustration. In a typical embodiment, the number of addresses which will be converted into a frequency domain representation will be much greater than eight. However, the eight addresses of matrix 605 shown in FIG. 6 are intended to serve as an example of how a larger number of addresses may be processed. Matrix 610 may be adjusted in size to accommodate a larger number of addresses that have been captured in other embodiments. For example, if one thousand addresses have been captured, then matrix 610 may have one thousand rows and eight columns. Additionally, matrix 610 may have more than eight columns in other embodiments, to increase the granularity of frequency components which can be detected in the addresses of matrix 605. For example, in other embodiments, matrix 610 may have 16 columns, 32 columns, 64 columns, or other numbers of columns. It is also noted that the values shown in matrix 610 are merely indicative of one embodiment. Other embodiments may utilize other values within matrix 610 without departing from the spirit of the methods and mechanisms disclosed herein. For example, other DCT matrices may be utilized with other values. Additionally, in other embodiments, other types of transforms besides the DCT may be utilized to generate a frequency domain representation from address matrix 605.
For example, although the values within matrix 610 are within the range from −1 to 1, in other embodiments, the values may be scaled by a factor into other ranges. Also, in some embodiments, a custom matrix may be utilized with custom waveforms in each column corresponding to the waveforms expected to be encountered in the access patterns being serviced by the storage system. Lower frequencies may be utilized in the leftmost columns of matrix 610 with the frequency increasing as the columns move to the right, but the frequencies may differ from the traditional DCT matrix scheme. For example, in another embodiment, the leftmost column of the multiplication matrix may have a positive frequency rather than having a frequency of zero as is shown in matrix 610. In a further embodiment, only low frequencies may be represented in the multiplication matrix, and the values in the resultant matrix may indicate the presence or absence of low frequencies, while omitting any check for high frequencies. Similarly, in a still further embodiment, only high frequencies may be represented in the multiplication matrix, and the values in the resultant matrix may indicate the presence or absence of high frequencies, while omitting any check for low frequency components. Variations on the above described techniques are possible and are contemplated.
Referring now to FIG. 7, one embodiment of a method 700 for assigning priorities to metadata stored in a cache is shown. Any of the storage controllers, caches, and/or other control logic described throughout this specification may generally operate in accordance with method 700. In addition, the steps in this embodiment are shown in sequential order. However, some steps may occur in a different order than shown, some steps may be performed concurrently, some steps may be combined with other steps, and some steps may be absent in another embodiment.
An amount of randomness may be measured in a plurality of accesses to a given address space (block 705). In one embodiment, a frequency domain representation of the addresses of the plurality of accesses may be generated. Then the components of the frequency domain representation may be analyzed to determine if the representation includes mostly high frequency components or mostly low frequency components. If the representation includes mostly high frequency components, then the amount of randomness may have a high value. If the representation includes mostly low frequency components, then the amount of randomness may be measured as having a low value. In other embodiments, other techniques for measuring the amount of randomness in a plurality of accesses to the given address space may be utilized. It is also noted that in one embodiment, the given address space may be an individual region of the total logical address space of a storage system.
A relatively high priority may be assigned to metadata associated with the given address space if the measured amount of randomness is relatively low (block 710). In one embodiment, the measured amount of randomness may be considered relatively low if the measured amount is less than a first threshold. A relatively low priority may be assigned to the metadata if the measured amount of randomness is relatively high (block 715). In one embodiment, the measured amount of randomness may be considered relatively high if the measured amount is greater than a second threshold. In one embodiment, the metadata may be assigned a score based on the assigned priority, and the score may be stored in the cache alongside the metadata.
Metadata with a relatively high priority may be preferentially retained in the cache over metadata with a relatively low priority (block 720). In one embodiment, the cache may utilize a cache replacement algorithm which bases eviction decisions on a variety of factors. For example, in one embodiment, the cache may utilize a least recently used (LRU) algorithm to select a first metadata page to be considered for eviction. After selecting the first metadata page, the cache may check the priority assigned to the metadata page based on the measured amount of randomness. If the selected metadata page has a relatively high priority, then the cache may retain the first metadata page and utilize the LRU algorithm to select a second metadata page to be considered for eviction. The cache may continue selecting metadata pages using the LRU algorithm until a metadata page with a relatively low priority is found. In other embodiments, the cache may utilize other techniques for determining which metadata pages to evict, with these other techniques based at least in part on the priorities assigned in blocks 710 and 715 of method 700. For example, in another embodiment, multiple factors may be combined to generate a total score for each metadata page, with an LRU factor generating a first score, with a randomness measure generating a second score, and so on, with a plurality of scores used to generate the total score. In some cases, a scaling factor may be applied to each score to scale the individual scores according to a particular formula when generating the total score. Other techniques for using the assigned priority as part of a cache replacement algorithm are possible and are contemplated.
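The LRU-plus-priority victim selection described above might be sketched as follows; the data structures, the priority threshold, and the weights used in the combined-score variant are assumptions introduced only for illustration.

    def select_victim(page_scores, lru_order, high_priority_threshold=0.5):
        # Walk candidate pages from least to most recently used and evict the
        # first one whose randomness-based priority score is low; if every
        # candidate is high priority, fall back to the plain LRU victim.
        # page_scores: {page_id: score}, lru_order: [page_id, ...] (hypothetical structures)
        for page_id in lru_order:
            if page_scores[page_id] < high_priority_threshold:
                return page_id
        return lru_order[0]

    def combined_score(lru_score, randomness_score, w_lru=0.5, w_rand=0.5):
        # Scaled combination of individual factor scores; the weights are assumptions.
        return w_lru * lru_score + w_rand * randomness_score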
It is noted that method 700 may be performed at various times by a storage controller, processor, cache, and/or other control logic. In some embodiments, method 700 may be performed on a fixed schedule. However, in other embodiments, one or more events may trigger method 700. These events may include detecting cache thrashing, determining there are processing resources available, determining the traffic being handled by the storage controller is below a threshold, and/or various other events.
Referring now to FIG. 8, one embodiment of a method 800 for measuring the randomness of access patterns to regions of a logical address space is shown. Any of the storage controllers, caches, and/or other control logic described throughout this specification may generally operate in accordance with method 800. In addition, the steps in this embodiment are shown in sequential order. However, some steps may occur in a different order than shown, some steps may be performed concurrently, some steps may be combined with other steps, and some steps may be absent in another embodiment.
A plurality of I/O accesses to one or more storage devices of a storage system may be captured over a first period of time (block 805). The capturing of the I/O accesses includes storing the logical address of each access. Additional information associated with each I/O access may also be stored in some embodiments. The length of the first period of time may vary depending on the embodiment. Next, the distribution of the accesses to areas within the total logical address space may be analyzed to determine which areas of the logical address space have the highest numbers of accesses (block 810). Then, the total logical address space may be partitioned into a plurality of regions based on the distribution analysis (block 815). In one embodiment, areas with large numbers of I/O accesses may be partitioned into smaller regions than areas with small numbers of I/O accesses. Alternatively, the logical address space may be partitioned using a predetermined partitioning pattern, and this partitioning may be performed prior to block 805. For example, in one embodiment, the logical address space may be partitioned into equally sized 100 GB regions. Other region sizes may be utilized in other embodiments.
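As a minimal sketch of the fixed-size alternative mentioned above (the region size is only an assumed configuration value), a logical address can be mapped to its region as follows:

    REGION_SIZE = 100 * 2**30  # assumed fixed region size of 100 GB

    def region_of(logical_address):
        # Map a logical address to the index of the region that contains it.
        return logical_address // REGION_SIZE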
Next, the captured I/O accesses may be stored in lists which correspond to the regions of the logical address space (block 820). For example, in one embodiment, the logical address space may be partitioned into ten regions, and there may be a list for each of the ten regions. Each I/O access may be stored in the list which corresponds to the region in which the address of the I/O access belongs. Alternatively, rather than storing the I/O accesses in separate lists, only a single list may be maintained, with each I/O access within the list tagged with a region identifier (ID) which identifies the region within which the address of the I/O access falls.
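Continuing the sketch, and assuming the region_of() helper above, the captured accesses might be grouped into one address list per region as follows:

    from collections import defaultdict

    def group_by_region(captured_addresses):
        # Build one list of addresses per region of the logical address space.
        region_lists = defaultdict(list)
        for address in captured_addresses:
            region_lists[region_of(address)].append(address)
        return region_lists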
Next, for each region of the logical address space, the addresses of the I/O accesses may be converted into a frequency domain representation (block 825). In one embodiment, the conversion into the frequency domain representation may be performed using a Fourier-related transform. For example, in one embodiment, an FFT may be performed on the addresses of the I/O accesses of each region of the logical address space. In some cases, the number of addresses may not equal a power of two, and so the addresses may be padded with zeroes so that the total number of addresses and zeroes equals a power of two, which improves the efficiency of the FFT implementation. In other embodiments, other types of transforms may be used to convert the addresses into a frequency domain representation.
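A minimal sketch of block 825, assuming NumPy's FFT routines: each region's address list is zero-padded up to the next power of two before the transform is applied.

    import numpy as np

    def frequency_representation(addresses):
        # Zero-pad the address sequence to a power-of-two length and return
        # the magnitude spectrum of its FFT.
        signal = np.asarray(addresses, dtype=float)
        padded_length = 1 << max(1, (len(signal) - 1).bit_length())
        padded = np.zeros(padded_length)
        padded[:len(signal)] = signal - signal.mean()  # remove the DC offset
        return np.abs(np.fft.rfft(padded))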
Next, for each region, a spectral analysis of the corresponding frequency domain representation may be performed (block 830). The spectral analysis may be performed using any suitable technique. For example, the spectral analysis may involve determining whether the corresponding frequency domain representation comprises mostly high frequency components or mostly low frequency components based on the frequency distribution of the frequency domain representation. In one embodiment, the total spectral power below a first cutoff frequency may be calculated and compared to a first threshold, and the total spectral power above a second cutoff frequency may be calculated and compared to a second threshold. In another embodiment, the peak amplitude within the frequency domain representation may be identified and used to characterize the corresponding region.
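A minimal sketch of block 830 under assumed cutoff indices and thresholds; the spectrum argument is the magnitude spectrum produced by the frequency_representation() sketch above.

    def classify_spectrum(spectrum, low_cutoff, high_cutoff,
                          first_threshold, second_threshold):
        # Sum the spectral power below the first cutoff and above the second
        # cutoff (skipping the DC bin) and compare each to its threshold.
        low_power = spectrum[1:low_cutoff].sum()
        high_power = spectrum[high_cutoff:].sum()
        if high_power > second_threshold:
            return "mostly_high_frequency"   # suggests a random access pattern
        if low_power > first_threshold:
            return "mostly_low_frequency"    # suggests a regular access pattern
        return "mixed"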
Next, for each region, a score may be generated based on the spectral analysis of the corresponding frequency domain representation (block 835). In one embodiment, high scores may be given to frequency domain representations with mostly low frequency components and low scores may be given to frequency domain representations with mostly high frequency components. In other embodiments, other techniques for generating a score for a region may be utilized. Then, each metadata page stored in the cache may be assigned the score of the region to which that metadata corresponds (block 840). The cache replacement algorithm may utilize the generated scores to determine which pages in the cache to replace when new metadata is loaded in the cache (block 845). The cache replacement algorithm may attempt to evict first metadata corresponding to one or more first workloads exhibiting high random access patterns while retaining second metadata corresponding to one or more second workloads exhibiting low random access patterns.
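A minimal sketch of blocks 835-845 under the same assumptions: regions whose spectra are dominated by low frequency components receive high scores, and each cached metadata page inherits the score of the region it describes.

    def score_region(spectrum, cutoff):
        # Score near 1.0 for spectra dominated by low frequency components,
        # near 0.0 for spectra dominated by high frequency components.
        low_power = spectrum[1:cutoff].sum()
        high_power = spectrum[cutoff:].sum()
        return low_power / (low_power + high_power + 1e-12)

    def assign_scores(cached_pages, region_scores):
        # cached_pages maps page_id -> region_id; the returned mapping of
        # page_id -> score is what the replacement algorithm would consult
        # when choosing eviction victims.
        return {page_id: region_scores.get(region_id, 0.0)
                for page_id, region_id in cached_pages.items()}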
It is noted that method 800 may be performed at various times by a storage controller, processor, cache, and/or other control logic. In some embodiments, method 800 may be performed on a fixed schedule. However, in other embodiments, one or more events may trigger method 800.
Turning now to FIG. 9, one embodiment of a method 900 for prioritizing metadata stored in a cache is shown. Any of the storage controllers, caches, and/or other control logic described throughout this specification may generally operate in accordance with method 900. In addition, the steps in this embodiment are shown in sequential order. However, some steps may occur in a different order than shown, some steps may be performed concurrently, some steps may be combined with other steps, and some steps may be absent in another embodiment.
The randomness of each access pattern of a plurality of access patterns targeting one or more storage devices may be measured (block 905). In one embodiment, the randomness of an access pattern may be measured by capturing a plurality of addresses of a plurality of accesses and then generating a frequency domain representation of the plurality of addresses. Then, a spectral analysis of the frequency domain representation may be performed to determine the randomness of the access pattern. If the spectral analysis determines there are mostly low frequency components in the frequency domain representation, then the access pattern may be identified as a low random access pattern. If the spectral analysis identifies mostly high frequency components in the frequency domain representation, then the access pattern may be identified as a high random access pattern. In other embodiments, other techniques for measuring the randomness of the access patterns may be utilized.
Next, metadata corresponding to low random access patterns may be prioritized when determining which metadata to retain in a cache (block 910). For example, in one embodiment, a first workload may be accessing a database. If a query is run on the database, there may be a pattern of accesses at fixed intervals to the database table. Accordingly, the first workload may be identified as a low random access pattern during the spectral analysis of its frequency domain representation, and then metadata corresponding to the first workload may be retained in the cache. Additionally, metadata corresponding to high random access patterns may be evicted from the cache when new metadata is loaded into the cache (block 915).
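Tying the sketches together, a fixed-interval pattern such as the database example above should yield a comparatively low randomness measure, while a uniformly random pattern should yield a comparatively high one (expected behavior of the illustrative helpers above, not a result reported in this disclosure):

    import numpy as np

    strided_addresses = [1000 + 8 * i for i in range(256)]             # fixed-interval accesses
    random_addresses = np.random.randint(0, 10**9, size=256).tolist()  # random accesses

    print(measure_randomness(strided_addresses))  # expected to be comparatively low
    print(measure_randomness(random_addresses))   # expected to be comparatively high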
It is noted that method 900 may be performed at various times by a storage controller, processor, cache, and/or other control logic. In some embodiments, method 900 may be performed on a fixed schedule. However, in other embodiments, one or more events may trigger method 900.
It is noted that the above-described embodiments may comprise software. In such an embodiment, the program instructions that implement the methods and/or mechanisms may be conveyed or stored on a non-transitory computer readable medium. Numerous types of non-transitory media which are configured to store program instructions are available and include hard disks, floppy disks, CD-ROM, DVD, flash memory, Programmable ROMs (PROM), random access memory (RAM), and various other forms of volatile or non-volatile storage.
In various embodiments, one or more portions of the methods and mechanisms described herein may form part of a cloud-computing environment. In such embodiments, resources may be provided over the Internet as services according to one or more various models. Such models may include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). In IaaS, computer infrastructure is delivered as a service. In such a case, the computing equipment is generally owned and operated by the service provider. In the PaaS model, software tools and underlying equipment used by developers to develop software solutions may be provided as a service and hosted by the service provider. SaaS typically includes a service provider licensing software as a service on demand. The service provider may host the software, or may deploy the software to a customer for a given period of time. Numerous combinations of the above models are possible and are contemplated.
It should be emphasized that the above-described embodiments are only non-limiting examples of implementations. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.

Claims (20)

What is claimed is:
1. A method comprising:
measuring an amount of randomness within a plurality of addresses that are referenced in a plurality of accesses to one or more storage devices including adding together frequency component values above a first cutoff frequency in a first frequency distribution of a first frequency domain representation of the plurality of addresses of the plurality of accesses; and
caching, in dependence upon the amount of randomness for each of the plurality of addresses, metadata associated with one or more of the plurality of addresses.
2. The method as recited in claim 1, wherein measuring the amount of randomness comprises generating the first frequency domain representation of the plurality of addresses of the plurality of accesses.
3. The method as recited in claim 1, wherein the plurality of accesses target a logical address space.
4. The method as recited in claim 3, wherein measuring the amount of randomness comprises:
capturing the plurality of addresses from the plurality of accesses;
generating the first frequency domain representation of a first plurality of addresses from the captured plurality of addresses, wherein the first plurality of addresses correspond to a first region of the logical address space, and wherein the first frequency domain representation has the first frequency distribution;
identifying the first region as a relatively low random region responsive to determining the frequency component values above the first cutoff frequency are less than a first threshold; and
identifying the first region as a relatively high random region responsive to determining the frequency component values above the first cutoff frequency are greater than a first threshold.
5. The method as recited in claim 4, further comprising:
generating a first score corresponding to the first region, wherein the first score is based on the amount of randomness in the first frequency distribution;
identifying one or more pages of first metadata corresponding to the first region which are stored in the cache;
assigning the first score to each of the one or more pages of first metadata which are stored in the cache; and
utilizing the first score when determining whether to evict the one or more pages of first metadata from the cache.
6. The method as recited in claim 4, further comprising partitioning the logical address space into a plurality of regions.
7. The method as recited in claim 6, further comprising generating a second frequency domain representation of a second plurality of addresses from the captured plurality of addresses, wherein the second plurality of addresses correspond to a second region of the logical address space, and wherein the second frequency domain representation has a second frequency distribution.
8. A system comprising:
one or more storage devices;
a cache; and
a storage controller;
wherein the storage controller is configured to:
measure an amount of randomness within a plurality of addresses that are referenced in a plurality of accesses to one or more storage devices including adding together frequency component values above a first cutoff frequency in a first frequency distribution of a first frequency domain representation of the plurality of addresses of the plurality of accesses; and
cache, in dependence upon the amount of randomness for each of the plurality of addresses, metadata associated with one or more of the plurality of addresses.
9. The system as recited in claim 8, wherein measuring the amount of randomness comprises generating the first frequency domain representation of the plurality of addresses of the plurality of accesses.
10. The system as recited in claim 8, wherein the plurality of accesses target a logical address space.
11. The system as recited in claim 10, wherein measuring the amount of randomness comprises:
capturing the plurality of addresses from the plurality of accesses;
generating the first frequency domain representation of a first plurality of addresses from the captured plurality of addresses, wherein the first plurality of addresses correspond to a first region of the logical address space, and wherein the first frequency domain representation has the first frequency distribution;
identifying the first region as a relatively low random region responsive to determining the frequency component values above the first cutoff frequency are less than a first threshold; and
identifying the first region as a relatively high random region responsive to determining the frequency component values above the first cutoff frequency are greater than a first threshold.
12. The system as recited in claim 11, wherein the storage controller is further configured to generate a first score corresponding to the first region, wherein the first score is based on the amount of randomness in the first frequency distribution, and wherein the cache is further configured to:
identify one or more pages of first metadata corresponding to the first region which are stored in the cache;
assign the first score to each of the one or more pages of first metadata which are stored in the cache; and
utilize the first score when determining whether to evict the one or more pages of first metadata from the cache.
13. The system as recited in claim 11, wherein the storage controller is further configured to partition the logical address space into a plurality of regions.
14. The system as recited in claim 13, wherein the storage controller is further configured to generate a second frequency domain representation of a second plurality of addresses from the captured plurality of addresses, wherein the second plurality of addresses correspond to a second region of the logical address space, and wherein the second frequency domain representation has a second frequency distribution.
15. A non-transitory computer readable storage medium storing program instructions, wherein the program instructions are executable by a processor to:
measure an amount of randomness within a plurality of addresses that are referenced in a plurality of accesses to one or more storage devices including adding together frequency component values above a first cutoff frequency in a first frequency distribution of a first frequency domain representation of the plurality of addresses of the plurality of accesses; and
cache, in dependence upon the amount of randomness for each of the plurality of addresses, metadata associated with one or more of the plurality of addresses.
16. The non-transitory computer readable storage medium as recited in claim 15, wherein measuring the amount of randomness comprises generating a frequency domain representation of the plurality of addresses of the plurality of accesses.
17. The non-transitory computer readable storage medium as recited in claim 15, wherein the plurality of accesses target a logical address space.
18. The non-transitory computer readable storage medium as recited in claim 17, wherein measuring the amount of randomness comprises:
capturing the plurality of addresses from the plurality of accesses;
generating a first frequency domain representation of a first plurality of addresses from the captured plurality of addresses, wherein the first plurality of addresses correspond to a first region of the logical address space, and wherein the first frequency domain representation has the first frequency distribution;
identifying the first region as a relatively low random region responsive to determining the frequency component values above the first cutoff frequency are less than a first threshold; and
identifying the first region as a relatively high random region responsive to determining the frequency component values above the first cutoff frequency are greater than a first threshold.
19. The non-transitory computer readable storage medium as recited in claim 18, wherein the program instructions are further executable by a processor to:
generate a first score corresponding to the first region, wherein the first score is based on the amount of randomness in the first frequency distribution;
identify one or more pages of first metadata corresponding to the first region which are stored in the cache;
assign the first score to each of the one or more pages of first metadata which are stored in the cache; and
utilize the first score when determining whether to evict the one or more pages of first metadata from the cache.
20. The non-transitory computer readable storage medium as recited in claim 18, wherein the program instructions are further executable by a processor to partition the logical address space into a plurality of regions.
US14/939,693 2014-01-09 2015-11-12 Using frequency domain to prioritize storage of metadata in a cache Expired - Fee Related US9804973B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US14/939,693 US9804973B1 (en) 2014-01-09 2015-11-12 Using frequency domain to prioritize storage of metadata in a cache
US15/682,699 US10191857B1 (en) 2014-01-09 2017-08-22 Machine learning for metadata cache management

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/151,257 US9208086B1 (en) 2014-01-09 2014-01-09 Using frequency domain to prioritize storage of metadata in a cache
US14/939,693 US9804973B1 (en) 2014-01-09 2015-11-12 Using frequency domain to prioritize storage of metadata in a cache

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/151,257 Continuation US9208086B1 (en) 2014-01-09 2014-01-09 Using frequency domain to prioritize storage of metadata in a cache

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/682,699 Continuation US10191857B1 (en) 2014-01-09 2017-08-22 Machine learning for metadata cache management

Publications (1)

Publication Number Publication Date
US9804973B1 true US9804973B1 (en) 2017-10-31

Family

ID=54708302

Family Applications (3)

Application Number Title Priority Date Filing Date
US14/151,257 Active 2034-06-06 US9208086B1 (en) 2014-01-09 2014-01-09 Using frequency domain to prioritize storage of metadata in a cache
US14/939,693 Expired - Fee Related US9804973B1 (en) 2014-01-09 2015-11-12 Using frequency domain to prioritize storage of metadata in a cache
US15/682,699 Expired - Fee Related US10191857B1 (en) 2014-01-09 2017-08-22 Machine learning for metadata cache management

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US14/151,257 Active 2034-06-06 US9208086B1 (en) 2014-01-09 2014-01-09 Using frequency domain to prioritize storage of metadata in a cache

Family Applications After (1)

Application Number Title Priority Date Filing Date
US15/682,699 Expired - Fee Related US10191857B1 (en) 2014-01-09 2017-08-22 Machine learning for metadata cache management

Country Status (1)

Country Link
US (3) US9208086B1 (en)

US20160098191A1 (en) 2014-10-07 2016-04-07 Pure Storage, Inc. Optimizing replication by distinguishing user and system write activity
US20160098199A1 (en) 2014-10-07 2016-04-07 Pure Storage, Inc. Utilizing unmapped and unknown states in a replicated storage system
US9552248B2 (en) 2014-12-11 2017-01-24 Pure Storage, Inc. Cloud alert to replica

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Microsoft Corporation, "Fundamentals of Garbage Collection", Retrieved Aug. 30, 2013 via the WayBack Machine, 11 pages.
Microsoft Corporation, "GCSettings.IsServerGC Property", Retrieved Oct. 27, 2013 via the WayBack Machine, 3 pages.

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10191857B1 (en) * 2014-01-09 2019-01-29 Pure Storage, Inc. Machine learning for metadata cache management
US20170083447A1 (en) * 2015-09-22 2017-03-23 EMC IP Holding Company LLC Method and apparatus for data storage system
US10860493B2 (en) * 2015-09-22 2020-12-08 EMC IP Holding Company LLC Method and apparatus for data storage system
US11016909B2 (en) 2019-08-26 2021-05-25 International Business Machines Corporation Cache page retention based on page cost

Also Published As

Publication number Publication date
US10191857B1 (en) 2019-01-29
US9208086B1 (en) 2015-12-08

Similar Documents

Publication Publication Date Title
US10191857B1 (en) Machine learning for metadata cache management
CN108804031B (en) Optimal record lookup
US9589008B2 (en) Deduplication of volume regions
US9582222B2 (en) Pre-cache similarity-based delta compression for use in a data storage system
US8311964B1 (en) Progressive sampling for deduplication indexing
US9413527B2 (en) Optimizing signature computation and sampling for fast adaptive similarity detection based on algorithm-specific performance
US9569357B1 (en) Managing compressed data in a storage system
US9779026B2 (en) Cache bypass utilizing a binary tree
Meister et al. Block locality caching for data deduplication
US10409728B2 (en) File access predication using counter based eviction policies at the file and page level
US9658957B2 (en) Systems and methods for managing data input/output operations
US20120297142A1 (en) Dynamic hierarchical memory cache awareness within a storage system
US9069680B2 (en) Methods and systems for determining a cache size for a storage system
Park et al. A lookahead read cache: improving read performance for deduplication backup storage
Wu et al. A differentiated caching mechanism to enable primary storage deduplication in clouds
JP2017049806A (en) Storage control device and storage control program
WO2012109145A2 (en) Pre-cache similarity-based delta compression for use in a data storage system
US11294872B2 (en) Efficient database management system and method for use therewith
Elyasi et al. Content popularity-based selective replication for read redirection in SSDs
US20140258634A1 (en) Allocating Enclosure Cache In A Computing System
US9354820B2 (en) VSAM data set tier management
US9864761B1 (en) Read optimization operations in a storage system
US11914527B2 (en) Providing a dynamic random-access memory cache as second type memory per application process
US11340822B2 (en) Movement of stored data based on occurrences of one or more n-gram strings in the stored data
US11099756B2 (en) Managing data block compression in a storage system

Legal Events

Date Code Title Description
AS Assignment

Owner name: PURE STORAGE, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SHALEV, ORI;REEL/FRAME:037027/0303

Effective date: 20140109

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20211031