Principles
Cache memory is intended to give memory speed approaching that of the fastest memories available, while providing a large memory size at the price of less expensive types of semiconductor memory. The concept is illustrated in Figure 4.13. There is a relatively large and slow main memory together with a smaller, faster cache memory. The cache contains a copy of portions of main memory. When the processor attempts to read a word of memory, a check is made to determine whether the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor. Because of the phenomenon of locality of reference, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that future references will be to other words in the block.
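The effect of locality of reference can be illustrated with a deliberately tiny model. The sketch below is not from the text: it assumes a toy cache that holds exactly one block at a time, and the name `count_hits` and the access patterns are hypothetical, chosen only to show why fetching a whole block pays off for nearby references.

```python
def count_hits(addresses, words_per_block):
    """Count hits for a toy cache that holds exactly one block at a time."""
    cached_block = None  # block number currently held; None while empty
    hits = 0
    for addr in addresses:
        block = addr // words_per_block  # block that contains this word
        if block == cached_block:
            hits += 1                    # word is already in the cache
        else:
            cached_block = block         # miss: fetch the whole block
    return hits


# Hypothetical access patterns over 64 references, 8 words per block.
sequential = list(range(64))          # neighbouring words: good locality
strided = [i * 8 for i in range(64)]  # every reference lands in a new block

print(count_hits(sequential, 8))  # 56 hits: only the first word of each block misses
print(count_hits(strided, 8))     # 0 hits: no block is ever reused
```

With sequential access, each fetched block satisfies the next seven references as well, which is the behaviour the paragraph above describes.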
Figure 4.14 depicts the structure of a cache/main-memory system. Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address. For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each. That is, there are M = 2^n/K blocks. The cache consists of C lines of K words each, and the number of lines is considerably less than the number of main memory blocks (C << M). At any time, some subset of the blocks of memory resides in lines in the cache. If a word in a block of memory is read, that block is transferred to one of the lines of the cache. Because there are more blocks than lines, an individual line cannot be uniquely and permanently dedicated to a particular block. Thus, each line includes a tag that identifies which particular block is currently being stored. The tag is usually a portion of the main memory address, as described later in this section.
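The arithmetic behind this structure can be checked with a short sketch. The concrete values (n = 16, K = 4, C = 128) and the helper `split_address` are assumptions made for illustration; the text itself leaves n, K, and C symbolic.

```python
def split_address(addr, n_bits, words_per_block):
    """Return (block_number, word_offset) for an n-bit word address."""
    if not 0 <= addr < 2 ** n_bits:
        raise ValueError("address does not fit in n bits")
    return addr // words_per_block, addr % words_per_block


n_bits = 16            # assumed address width n
K = 4                  # assumed words per block (and per cache line)
C = 128                # assumed number of cache lines
M = 2 ** n_bits // K   # number of main memory blocks, M = 2^n / K

print(M)                                 # 16384 blocks, far more than C = 128 lines
print(split_address(0x1235, n_bits, K))  # (1165, 1): word 1 of block 1165
```

Because M is so much larger than C here, many blocks must share each line over time, which is why a tag is needed to record which block a line currently holds.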
Figure 4.15 illustrates the read operation. The processor generates the address, RA, of a word to be read. If the word is contained in the cache, it is delivered to the processor. Otherwise, the block containing that word is loaded into the cache, and the word is delivered to the processor. Figure 4.15 shows these last two operations occurring in parallel and reflects the organization shown in Figure 4.16, which is typical of contemporary cache organizations. In this organization, the cache connects to the processor via data, control, and address lines. The data and address lines also attach to data and address buffers, which attach to a system bus from which main memory is reached. When a cache hit occurs, the data and address buffers are disabled and communication is only between processor and cache, with no system bus traffic. When a cache miss occurs, the desired address is loaded onto the system bus and the data are returned through the data buffer to both the cache and the processor. In other organizations, the cache is physically interposed between the processor and the main memory for all data, address, and control lines. In this latter case, for a cache miss, the desired word is first read into the cache and then transferred from cache to processor.
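The read operation can be sketched in a few lines of code. This is only an illustration under stated assumptions: the text does not fix how a block is assigned to a line at this point, so the sketch assumes direct mapping (line = block number mod C, tag = block number div C), and the class `ToyCache` and its fields are hypothetical names.

```python
class ToyCache:
    """Toy model of the read flow; direct mapping is an assumption."""

    def __init__(self, main_memory, num_lines, words_per_block):
        self.mem = main_memory            # list of words standing in for main memory
        self.C = num_lines
        self.K = words_per_block
        self.tags = [None] * num_lines    # tag of the block held by each line
        self.lines = [None] * num_lines   # the K words held by each line
        self.hits = 0
        self.misses = 0

    def read(self, ra):
        """Return the word at read address RA, loading its block on a miss."""
        block, offset = divmod(ra, self.K)
        line = block % self.C             # assumed direct mapping
        tag = block // self.C
        if self.tags[line] == tag:
            self.hits += 1                # hit: serve the word from the cache
        else:
            self.misses += 1              # miss: load the whole block, record its tag
            start = block * self.K
            self.lines[line] = self.mem[start:start + self.K]
            self.tags[line] = tag
        return self.lines[line][offset]


memory = [w * 10 for w in range(64)]      # hypothetical memory contents
cache = ToyCache(memory, num_lines=4, words_per_block=4)
values = [cache.read(ra) for ra in (5, 6, 7, 21, 5)]
print(values)                    # [50, 60, 70, 210, 50]
print(cache.hits, cache.misses)  # 2 3
```

Addresses 5, 6, and 7 share a block, so only the first of them misses. Address 21 maps to the same line with a different tag and evicts that block, so the final read of address 5 misses again.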
A discussion of the performance parameters related to cache use is contained in Appendix 4A.
------Solution--------------------
http://translate.google.com.hk/# (translated with Google Translate)
Principles
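The purpose of cache memory is to give memory speed close to that of the fastest memory available, and at the same time to provide a large memory size at the price of the less expensive types of semiconductor memory. The concept is illustrated in Figure 4.13. There is a relatively large, slow main memory together with a smaller, faster cache memory. The cache contains a copy of portions of main memory. When the processor tries to read a word of memory, a check is made to determine whether the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and the word is then delivered to the processor. Because of locality of reference, when a block of data is fetched into the cache to satisfy a single memory reference, future references are likely to be to other words in that block.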