性能主题
ArrayList与LinkedList性能比较
使用一种称为Hardware Performance Counters.方式来比较这两者,从CPU循环 本地缓存等等最底层的指标来衡量。
public class ListIteration
{
private static List<String> arrayList = new ArrayList<>();
private static List<String> linkedList = new LinkedList<>();
public static void initializeList(List<String> list, int bufferSize)
{
for (int i = 0; i < 50000; i++)
{
byte[] buffer = null;
if (bufferSize > 0)
{
buffer = new byte[bufferSize];
}
String s = String.valueOf(i);
list.add(s);
// avoid buffer to be optimized away
if (System.currentTimeMillis() == 0)
{
System.out.println(buffer);
}
}
}
public static void bench(List<String> list)
{
if (list.contains("bar"))
{
System.out.println("bar found");
}
}
public static void main(String[] args) throws Exception
{
if (args.length != 2) return;
List<String> benchList = "array".equals(args[0]) ? arrayList : linkedList;
int bufferSize = Integer.parseInt(args[1]);
initializeList(benchList, bufferSize);
HWCounters.init();
System.out.println("init done");
// warmup
for (int i = 0; i < 10000; i++)
{
bench(benchList);
}
Thread.sleep(1000);
System.out.println("warmup done");
HWCounters.start();
for (int i = 0; i < 1000; i++)
{
bench(benchList);
}
HWCounters.stop();
HWCounters.printResults();
HWCounters.shutdown();
}
}
运行在on Linux with 2 Xeon X5680,没有buffer情况:
[root@archi-srv]# java -cp .:overseer.jar com.ullink.perf.myths.ListIteration array 0
init done
warmup done
Cycles: 428,711,720
Instructions: 776,215,597
L2 hits: 5,302,792
L2 misses: 23,702,079
LLC hits: 42,933,789
LLC misses: 73
CPU migrations: 0
Local DRAM: 0
Remote DRAM: 0
[root@archi-srv]# /java -cp .:overseer.jar com.ullink.perf.myths.ListIteration linked 0
init done
warmup done
Cycles: 767,019,336
Instructions: 874,081,196
L2 hits: 61,489,499
L2 misses: 2,499,227
LLC hits: 3,788,468
LLC misses: 0
CPU migrations: 0
Local DRAM: 0
Remote DRAM: 0
- cycles值是 CPU 用来执行代码的循环次数,LinkedList相比ArrayList会花费更多CPU循环.
- Instructions方面 LinkedList有点高. 但是不是很明显
- L2 cache 访问的区别:ArrayList有显著 L2 丢失率,与LinkedList相比
- LLC hits 对于ArrayList很重要
下面是有buffer(intermediary buffer )的测试结果:
[root@archi-srv]# java -cp .:overseer.jar com.ullink.perf.myths.ListIteration array 256
init done
warmup done
Cycles: 584,965,201
Instructions: 774,373,285
L2 hits: 952,193
L2 misses: 62,840,804
LLC hits: 63,126,049
LLC misses: 4,416
CPU migrations: 0
Local DRAM: 824
Remote DRAM: 0
[root@archi-srv]# java -cp .:overseer.jar com.ullink.perf.myths.ListIteration linked 256
init done
warmup done
Cycles: 5,289,317,879
Instructions: 874,350,022
L2 hits: 1,487,037
L2 misses: 75,500,984
LLC hits: 81,881,688
LLC misses: 5,826,435
CPU migrations: 0
Local DRAM: 1,645,436
Remote DRAM: 1,042
- LinkedList的CPU循环是ArrayList的十倍。
- Instructions和之前一样
- 缓存访问方面, ArrayList相比之前 有更多的 L2 丢失/LLC 击中, 但是LLC丢失还算可以,而LinkedList虽然也有很多 L2 丢失/LLC 击中, 但是LLC 丢失/local DRAM 访问却高得多,两者区别就在这里。
ArrayList 能够让CPU运行更有预期,CPU自身的优化起作用,在遍历方面,ArrayList 也是更稳定一些,