"The JSR-133 Cookbook for Compiler Writers"
Original website: http://gee.cs.oswego.edu/dl/jmm/cookbook.html. By Doug Lea, with help from members of the JMM mailing list.
Chinese edition translated by 崔新.
Preface: Over the 10+ years since this was initially written, many processor and language memory model specifications and issues have become clearer and better understood. And many have not. While this guide is maintained to remain accurate, it is incomplete about some of these evolving details. For more extensive coverage, see especially the work of Peter Sewell and the Cambridge Relaxed Memory Concurrency Group.
This is an unofficial guide to implementing the new Java Memory Model (JMM) specified by JSR-133. It provides at most brief backgrounds about why various rules exist, instead concentrating on their consequences for compilers and JVMs with respect to instruction reorderings, multiprocessor barrier instructions, and atomic operations. It includes a set of recommended recipes for complying with JSR-133. This guide is "unofficial" because it includes interpretations of particular processor properties and specifications. We cannot guarantee that the interpretations are correct. Also, processor specifications and implementations may change over time.
For a compiler writer, the JMM mainly consists of rules disallowing reorderings of certain instructions that access fields (where "fields" include array elements) as well as monitors (locks).
Can Reorder:

| 1st operation (rows) / 2nd operation (columns) | Normal Load, Normal Store | Volatile Load, MonitorEnter | Volatile Store, MonitorExit |
|---|---|---|---|
| Normal Load, Normal Store | | | No |
| Volatile Load, MonitorEnter | No | No | No |
| Volatile Store, MonitorExit | | No | No |
Where:
The cells for Normal Loads are the same as for Normal Stores, those for Volatile Loads are the same as for MonitorEnter, and those for Volatile Stores are the same as for MonitorExit, so they are collapsed together here (but are expanded out as needed in subsequent tables). We consider here only variables that are readable and writable as an atomic unit -- that is, no bit fields, unaligned accesses, or accesses larger than word sizes available on a platform.
Any number of other operations might be present between the indicated 1st and 2nd operations in the table. So, for example, the "No" in cell [Normal Store, Volatile Store] says that a non-volatile store cannot be reordered with ANY subsequent volatile store; at least any that can make a difference in multithreaded program semantics.
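The "No" entries for [Normal Store, Volatile Store] and [Volatile Load, Normal Load] are what make the familiar flag-publication idiom work. The following is a minimal sketch, not taken from the cookbook; the class, field, and method names are made up for illustration.

```java
class VolatilePublish {
    static int data;               // normal field
    static volatile boolean ready; // volatile field

    // Publishes 42 through the volatile flag and returns what the reader saw.
    static int runOnce() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile load
                Thread.onSpinWait();
            }
            seen[0] = data;        // normal load: may not be moved above the volatile load
        });
        reader.start();
        data = 42;                 // normal store
        ready = true;              // volatile store: the normal store above may not sink below it
        try {
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

If either of the two reorderings were permitted, the reader could observe `ready == true` and still read the stale value of `data`.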
The JSR-133 specification is worded such that the rules for both volatiles and monitors apply only to those that may be accessed by multiple threads. If a compiler can somehow (usually only with great effort) prove that a lock is only accessible from a single thread, it may be eliminated. Similarly, a volatile field provably accessible from only a single thread acts as a normal field. More fine-grained analyses and optimizations are also possible, for example, those relying on provable inaccessibility from multiple threads only during certain intervals.
Blank cells in the table mean that the reordering is allowed if the accesses aren't otherwise dependent with respect to basic Java semantics (as specified in the JLS). For example even though the table doesn't say so, you can't reorder a load with a subsequent store to the same location. But you can reorder a load and store to two distinct locations, and may wish to do so in the course of various compiler transformations and optimizations. This includes cases that aren't usually thought of as reorderings; for example reusing a computed value based on a loaded field rather than reloading and recomputing the value acts as a reordering. However, the JMM spec permits transformations that eliminate avoidable dependencies, and in turn allow reorderings.
In all cases, permitted reorderings must maintain minimal Java safety properties even when accesses are incorrectly synchronized by programmers: All observed field values must be either the default zero/null "pre-construction" values, or those written by some thread. This usually entails zeroing all heap memory holding objects before it is used in constructors and never reordering other loads with the zeroing stores. A good way to do this is to zero out reclaimed memory within the garbage collector. See the JSR-133 spec for rules dealing with other corner cases surrounding safety guarantees.
The rules and properties described here are for accesses to Java-level fields. In practice, these will additionally interact with accesses to internal bookkeeping fields and data, for example object headers, GC tables, and dynamically generated code.
Loads and Stores of final fields act as "normal" accesses with respect to locks and volatiles, but impose two additional reordering rules:
These rules imply that reliable use of final fields by Java programmers requires that the load of a shared reference to an object with a final field itself be synchronized, volatile, or final, or derived from such a load, thus ultimately ordering the initializing stores in constructors with subsequent uses outside constructors.
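From the programmer's side, the requirement that the shared reference itself be loaded through a synchronized, volatile, or final access can be sketched as follows. This is an illustrative example only; `Box`, `shared`, and `runOnce` are hypothetical names, and the volatile reference is just one of the options the paragraph lists.

```java
class FinalFieldPublish {
    static final class Box {
        final int value;           // final field, initialized in the constructor
        Box(int value) {
            this.value = value;
        }
    }

    static volatile Box shared;    // the shared reference is loaded via a volatile read

    // Publishes a Box and returns the final field value the reader observed.
    static int runOnce() {
        shared = null;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Box b;
            while ((b = shared) == null) { // volatile load of the shared reference
                Thread.onSpinWait();
            }
            seen[0] = b.value;             // load of the final field, derived from that load
        });
        reader.start();
        shared = new Box(7);               // initializing store, then publication
        try {
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```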
Compilers and processors must both obey reordering rules. No particular effort is required to ensure that uniprocessors maintain proper ordering, since they all guarantee "as-if-sequential" consistency. But on multiprocessors, guaranteeing conformance often requires emitting barrier instructions. Even if a compiler optimizes away a field access (for example because a loaded value is not used), barriers must still be generated as if the access were still present. (Although see below about independently optimizing away barriers.)
Memory barriers are only indirectly related to higher-level notions described in memory models such as "acquire" and "release". And memory barriers are not themselves "synchronization barriers". And memory barriers are unrelated to the kinds of "write barriers" used in some garbage collectors. Memory barrier instructions directly control only the interaction of a CPU with its cache, with its write-buffer that holds stores waiting to be flushed to memory, and/or its buffer of waiting loads or speculatively executed instructions. These effects may lead to further interaction among caches, main memory and other processors. But there is nothing in the JMM that mandates any particular form of communication across processors so long as stores eventually become globally performed; i.e., visible across all processors, and that loads retrieve them when they are visible.
Nearly all processors support at least a coarse-grained barrier instruction, often just called a Fence, that guarantees that all loads and stores initiated before the fence will be strictly ordered before any load or store initiated after the fence. This is usually among the most time-consuming instructions on any given processor (often nearly as expensive as, or even more expensive than, atomic instructions). Most processors additionally support more fine-grained barriers.
A property of memory barriers that takes some getting used to is that they apply BETWEEN memory accesses. Despite the names given for barrier instructions on some processors, the right/best barrier to use depends on the kinds of accesses it separates. Here's a common categorization of barrier types that maps pretty well to specific instructions (sometimes no-ops) on existing processors:
On all processors discussed below, it turns out that instructions that perform StoreLoad also obtain the other three barrier effects, so StoreLoad can serve as a general-purpose (but usually expensive) Fence. (This is an empirical fact, not a necessity.) The opposite doesn't hold though. It is NOT usually the case that issuing any combination of other barriers gives the equivalent of a StoreLoad.
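Later JDKs (9 and up) expose fence methods on `java.lang.invoke.VarHandle` whose documented effects line up loosely with these categories: `loadLoadFence()`, `storeStoreFence()`, and `fullFence()` (which, like a StoreLoad instruction here, covers all four effects). The sketch below is an assumption-laden illustration rather than anything the cookbook prescribes: the class and method names are hypothetical, and a real reader that polls in a loop would also need a volatile or opaque access so the flag load is not hoisted by the compiler.

```java
import java.lang.invoke.VarHandle;

class FenceSketch {
    static int data;   // plain fields on purpose; ordering comes only from the fences
    static int ready;

    static void publish(int v) {
        data = v;
        VarHandle.storeStoreFence(); // roughly StoreStore: data store ordered before ready store
        ready = 1;
        VarHandle.fullFence();       // general-purpose fence, includes the StoreLoad effect
    }

    // Returns the published value, or -1 if nothing has been published yet.
    static int tryRead() {
        if (ready == 0) {
            return -1;
        }
        VarHandle.loadLoadFence();   // roughly LoadLoad: ready load ordered before data load
        return data;
    }

    public static void main(String[] args) {
        publish(42);
        System.out.println(tryRead());
    }
}
```

The fence calls sit between the accesses they separate, which mirrors the point above that barriers apply BETWEEN memory accesses.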
The following table shows how these barriers correspond to JSR-133 ordering rules.
Required barriers:

| 1st operation (rows) / 2nd operation (columns) | Normal Load | Normal Store | Volatile Load, MonitorEnter | Volatile Store, MonitorExit |
|---|---|---|---|---|
| Normal Load | | | | LoadStore |
| Normal Store | | | | StoreStore |
| Volatile Load, MonitorEnter | LoadLoad | LoadStore | LoadLoad | LoadStore |
| Volatile Store, MonitorExit | | | StoreLoad | StoreStore |
Plus the special final-field rule requiring a StoreStore barrier in:
x.finalField = v; StoreStore; sharedRef = x;
Here's an example showing placements.
| Java | Instructions |
|---|---|
| class X { | |
| int a, b; | |
| volatile int v, u; | |
| void f() { | |
| int i, j; | |
| i = a; | load a |
| j = b; | load b |
| i = v; | load v |
| | LoadLoad |
| j = u; | load u |
| | LoadStore |
| a = i; | store a |
| b = j; | store b |
| | StoreStore |
| v = i; | store v |
| | StoreStore |
| u = j; | store u |
| | StoreLoad |
| i = u; | load u |
| | LoadLoad |
| | LoadStore |
| j = b; | load b |
| a = i; | store a |
| } | |
| } | |
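The StoreLoad requirement between a volatile store and a later volatile load is what rules out the classic outcome in which two threads each miss the other's write. The following is a small sketch of that handshake; the class and method names are illustrative, not from the cookbook.

```java
class StoreLoadDemo {
    static volatile int x, y;

    // Each thread stores its own flag, then loads the other's.
    // Returns true only if both threads read 0, which the volatile rules forbid.
    static boolean bothSawZero() {
        x = 0;
        y = 0;
        int[] r = new int[2];
        Thread t1 = new Thread(() -> { x = 1; r[0] = y; }); // volatile store, then volatile load
        Thread t2 = new Thread(() -> { y = 1; r[1] = x; });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return r[0] == 0 && r[1] == 0;
    }

    public static void main(String[] args) {
        System.out.println(bothSawZero());
    }
}
```

With non-volatile `x` and `y`, no StoreLoad barrier would be required and the both-zero result would be allowed.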
The need for LoadLoad and LoadStore barriers on some processors interacts with their ordering guarantees for dependent instructions. On some (most) processors, a load or store that is dependent on the value of a previous load is ordered by the processor without need for an explicit barrier. This commonly arises in two kinds of cases, indirection:
Load x; Load x.field
and control:
Load x; if (predicate(x)) Load or Store y;
Processors that do NOT respect indirection ordering in particular require barriers for final field access for references initially obtained through shared references:
x = sharedRef; ... ; LoadLoad; i = x.finalField;
Conversely, as discussed below, processors that DO respect data dependencies provide several opportunities to optimize away LoadLoad and LoadStore barrier instructions that would otherwise need to be issued. (However, dependency does NOT automatically remove the need for StoreLoad barriers on any processor.)
The kinds of barriers needed on different processors further interact with implementation of MonitorEnter and MonitorExit. Locking and/or unlocking usually entail the use of atomic conditional update operations CompareAndSwap (CAS) or LoadLinked/StoreConditional (LL/SC) that have the semantics of performing a volatile load followed by a volatile store. While CAS or LL/SC minimally suffice, some processors also support other atomic instructions (for example, an unconditional exchange) that can sometimes be used instead of or in conjunction with atomic conditional updates.
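As a rough Java-level picture of locking built on an atomic conditional update, here is a minimal spin lock using `AtomicBoolean.compareAndSet`. It is a sketch under simplifying assumptions (no fairness, no blocking, no reentrancy) and is not how any particular JVM implements monitors; `CasSpinLock` and `demo` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CasSpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // CAS acts like a volatile load followed by a volatile store
        while (!held.compareAndSet(false, true)) {
            Thread.onSpinWait();
        }
    }

    void unlock() {
        held.set(false); // volatile store releases the lock
    }

    static int counter;  // plain field, protected only by the lock

    // Runs `threads` threads that each increment the counter `perThread` times.
    static int demo(int threads, int perThread) {
        CasSpinLock lock = new CasSpinLock();
        counter = 0;
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    lock.lock();
                    try {
                        counter++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(demo(2, 10_000));
    }
}
```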
On all processors, atomic operations protect against read-after-write problems for the locations being read/updated. (Otherwise standard loop-until-success constructions wouldn't work in the desired way.) But processors differ in whether atomic instructions provide more general barrier properties than the implicit StoreLoad for their target locations. On some processors these instructions also intrinsically perform barriers that would otherwise be needed for MonitorEnter/Exit; on others some or all of these barriers must be specifically issued.
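The "loop-until-success" construction mentioned above looks like this at the Java level. It is a generic sketch with a made-up helper (`addCapped`): read the current value, compute the replacement, and retry the CAS until no other thread has intervened.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLoop {
    // Adds delta to the cell but never lets it exceed cap; returns the new value.
    static int addCapped(AtomicInteger cell, int delta, int cap) {
        while (true) {
            int cur = cell.get();                   // read the target location
            int next = Math.min(cur + delta, cap);  // compute the update from what was read
            if (cell.compareAndSet(cur, next)) {    // succeeds only if the cell still holds cur
                return next;
            }
            // Another thread changed the cell between the read and the CAS; retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger(5);
        System.out.println(addCapped(cell, 3, 10));
    }
}
```

The loop is only correct because the CAS always sees the latest value of its own target location, which is the read-after-write protection the paragraph describes.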
Volatiles and Monitors have to be separated to disentangle these effects, giving:
Required Barriers:

| 1st operation (rows) / 2nd operation (columns) | Normal Load | Normal Store | Volatile Load | Volatile Store | MonitorEnter | MonitorExit |
|---|---|---|---|---|---|---|
| Normal Load | | | | LoadStore | | LoadExit |
| Normal Store | | | | StoreStore | | StoreExit |
| Volatile Load | LoadLoad | LoadStore | LoadLoad | LoadStore | LoadEnter | LoadExit |
| Volatile Store | | | StoreLoad | StoreStore | StoreEnter | StoreExit |
| MonitorEnter | EnterLoad | EnterStore | EnterLoad | EnterStore | EnterEnter | EnterExit |
| MonitorExit | | | ExitLoad | ExitStore | ExitEnter | ExitExit |
Plus the special final-field rule requiring a StoreStore barrier in:
x.finalField = v; StoreStore; sharedRef = x;
In this table, "Enter" is the same as "Load" and "Exit" is the same as "Store", unless overridden by the use and nature of atomic instructions. In particular:
The other types are specializations that are unlikely to play a role in compilation (see below) and/or reduce to no-ops on current processors. For example, EnterEnter is needed to separate nested MonitorEnters when there are no intervening loads or stores. Here's an example showing placements of most types:
| Java | Instructions |
|---|---|
| class X { | |
| int a; | |
| volatile int v; | |
| void f() { | |
| int i; | |
| synchronized(this) { | enter |
| | EnterLoad |
| | EnterStore |
| i = a; | load a |
| a = i; | store a |
| | LoadExit |
| | StoreExit |
| } | exit |
| | ExitEnter |
| synchronized(this) { | enter |
| | EnterEnter |
| synchronized(this) { | enter |
| | EnterExit |
| } | exit |
| | ExitExit |
| } | exit |
| | ExitEnter |
| | ExitLoad |
| i = v; | load v |
| | LoadEnter |
| synchronized(this) { | enter |
| | EnterExit |
| } | exit |
| | ExitEnter |
| | ExitStore |
| v = i; | store v |
| | StoreEnter |
| synchronized(this) { | enter |
| | EnterExit |
| } | exit |
| } | |
| } | |
Java-level access to atomic conditional update operations will be available in JDK1.5 via JSR-166 (concurrency utilities) so compilers will need to issue associated code, using a variant of the above table that collapses MonitorEnter and MonitorExit -- semantically, and sometimes in practice, these Java-level atomic updates act as if they are surrounded by locks.
Here's a listing of processors that are commonly used in MPs, along with links to documents providing information about them. (Some require some clicking around from the linked site and/or free registration to access manuals.) This isn't an exhaustive list, but it includes processors used in all current and near-future multiprocessor Java implementations I know of. The list and the properties of processors described below are not definitive. In some cases I'm just reporting what I read, and could have misread. Several reference manuals are not very clear about some properties relevant to the JMM. Please help make it definitive.
Good sources of hardware-specific information about barriers and related properties of machines not listed here are Hans Boehm's atomic_ops library, the Linux Kernel Source, and Linux Scalability Effort. Barriers needed in the Linux kernel correspond in straightforward ways to those discussed here, and have been ported to most processors. For descriptions of the underlying models supported on different processors, see Sarita Adve et al, Recent Advances in Memory Consistency Models for Hardware Shared-Memory Systems and Sarita Adve and Kourosh Gharachorloo, Shared Memory Consistency Models: A Tutorial.
Here's how these processors support barriers and atomics:
| Processor | LoadStore | LoadLoad | StoreStore | StoreLoad | Data dependency orders loads? | Atomic Conditional | Other Atomics | Atomics provide barrier? |
|---|---|---|---|---|---|---|---|---|
| sparc-TSO | no-op | no-op | no-op | membar (StoreLoad) | yes | CAS: casa | swap, ldstub | full |
| x86 | no-op | no-op | no-op | mfence or cpuid or locked insn | yes | CAS: cmpxchg | xchg, locked insn | full |
| ia64 | combine with st.rel or ld.acq | ld.acq | st.rel | mf | yes | CAS: cmpxchg | xchg, fetchadd | target + acq/rel |
| arm | dmb (see below) | dmb (see below) | dmb-st | dmb | indirection only | LL/SC: ldrex/strex | | target only |
| ppc | lwsync (see below) | hwsync (see below) | lwsync | hwsync | indirection only | LL/SC: ldarx/stwcx | | target only |
| alpha | mb | mb | wmb | mb | no | LL/SC: ldx_l/stx_c | | target only |
| pa-risc | no-op | no-op | no-op | no-op | yes | build from ldcw | ldcw | (NA) |
Many of these barriers usually reduce to no-ops. In fact, most of them reduce to no-ops, but in different ways under different processors and locking schemes. For the simplest examples, basic conformance to JSR-133 on x86 or sparc-TSO using CAS for locking amounts only to placing a StoreLoad barrier after volatile stores.
The conservative strategy above is likely to perform acceptably for many programs. The main performance issues surrounding volatiles occur for the StoreLoad barriers associated with stores. These ought to be relatively rare -- the main reason for using volatiles in concurrent programs is to avoid the need to use locks around reads, which is only an issue when reads greatly overwhelm writes. But this strategy can be improved in at least the following ways:
| Original 1st | Original ops | Original 2nd | => | Transformed 1st | Transformed ops | Transformed 2nd |
|---|---|---|---|---|---|---|
| LoadLoad | [no loads] | LoadLoad | => | | [no loads] | LoadLoad |
| LoadLoad | [no loads] | StoreLoad | => | | [no loads] | StoreLoad |
| StoreStore | [no stores] | StoreStore | => | | [no stores] | StoreStore |
| StoreStore | [no stores] | StoreLoad | => | | [no stores] | StoreLoad |
| StoreLoad | [no loads] | LoadLoad | => | StoreLoad | [no loads] | |
| StoreLoad | [no stores] | StoreStore | => | StoreLoad | [no stores] | |
| StoreLoad | [no volatile loads] | StoreLoad | => | | [no volatile loads] | StoreLoad |
Similar eliminations can be used for interactions with locks, but depend on how locks are implemented. Doing all this in the presence of loops, calls, and branches is left as an exercise for the reader. :-)
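For straight-line code, each row of the elimination table can be encoded directly as a check on one (1st barrier, intervening ops, 2nd barrier) pattern. The sketch below does only that: it applies a single row to a single pattern and makes no claim about how rows compose across a longer instruction stream. The string encoding of instructions (`load`, `store`, `vload`, `vstore`) and all names are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BarrierRows {
    // Hypothetical encoding: "load x"/"store x" are normal accesses,
    // "vload x"/"vstore x" are volatile accesses.
    static boolean isLoad(String op) {
        return op.startsWith("load ") || op.startsWith("vload ");
    }

    static boolean isStore(String op) {
        return op.startsWith("store ") || op.startsWith("vstore ");
    }

    // Applies at most one table row to the pattern and returns the resulting sequence.
    static List<String> simplify(String first, List<String> between, String second) {
        boolean noLoads = between.stream().noneMatch(BarrierRows::isLoad);
        boolean noStores = between.stream().noneMatch(BarrierRows::isStore);
        boolean noVolatileLoads = between.stream().noneMatch(op -> op.startsWith("vload "));

        boolean dropFirst = false;
        boolean dropSecond = false;
        if (first.equals("LoadLoad") && noLoads
                && (second.equals("LoadLoad") || second.equals("StoreLoad"))) {
            dropFirst = true;   // rows 1 and 2
        } else if (first.equals("StoreStore") && noStores
                && (second.equals("StoreStore") || second.equals("StoreLoad"))) {
            dropFirst = true;   // rows 3 and 4
        } else if (first.equals("StoreLoad") && second.equals("LoadLoad") && noLoads) {
            dropSecond = true;  // row 5
        } else if (first.equals("StoreLoad") && second.equals("StoreStore") && noStores) {
            dropSecond = true;  // row 6
        } else if (first.equals("StoreLoad") && second.equals("StoreLoad") && noVolatileLoads) {
            dropFirst = true;   // row 7
        }

        List<String> out = new ArrayList<>();
        if (!dropFirst) {
            out.add(first);
        }
        out.addAll(between);
        if (!dropSecond) {
            out.add(second);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(simplify("StoreStore", List.of("load a"), "StoreLoad"));
    }
}
```

A real compiler pass would also have to handle loops, calls, and branches, which this sketch deliberately leaves out.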