Mapping Parallelism to Multi-cores: A Machine Learning Based Approach, PPoPP’09
http://portal.acm.org/citation.cfm?id=1504176.1504189
The authors extract static code features (operations, control flow, memory accesses, and binary and bitwise operations) using LLVM; collect data features (kernel loop counts, L1 data-cache miss ratio, and branch miss ratio) from the performance monitoring unit (PMU); and record one runtime feature, the execution time. These features are the inputs to an artificial neural network (ANN), whose outputs are the predicted best scheduling policy and the predicted speedup. The drawback is that the internals of the ANN are a black box: the model gives no readable explanation of why it picked a given policy.
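To make the input/output shape concrete, here is a minimal sketch of a feed-forward network that takes such a feature vector and returns a scheduling policy plus a speedup estimate. It is not the paper's model: the feature names, the policy list (assumed OpenMP-style), the layer sizes, and the random untrained weights are all hypothetical and chosen only for illustration. It also shows the black-box point, since the prediction comes out of weight matrices that carry no human-readable rule.

```python
import math
import random

# Hypothetical feature layout loosely following the summary above.
# Values are assumed to be pre-normalized to roughly [0, 1].
FEATURES = [
    "operations", "control_flows", "memory_accesses", "bitwise_ops",  # static
    "loop_count", "l1_dcache_miss_ratio", "branch_miss_ratio",        # data
    "exec_time",                                                      # runtime
]
# Assumed policy set; the notes do not list the actual candidates.
POLICIES = ["static", "dynamic", "guided"]


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights; untrained."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    """Affine layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


class TinyMapper:
    """One hidden layer with two output heads: policy (class) and speedup (scalar)."""

    def __init__(self, n_hidden=6, seed=0):
        rng = random.Random(seed)
        self.w1, self.b1 = init_layer(len(FEATURES), n_hidden, rng)
        self.w_policy, self.b_policy = init_layer(n_hidden, len(POLICIES), rng)
        self.w_speed, self.b_speed = init_layer(n_hidden, 1, rng)

    def predict(self, features):
        x = [features[name] for name in FEATURES]
        hidden = [math.tanh(v) for v in dense(x, self.w1, self.b1)]
        probs = softmax(dense(hidden, self.w_policy, self.b_policy))
        speedup = dense(hidden, self.w_speed, self.b_speed)[0]
        policy = POLICIES[probs.index(max(probs))]
        return policy, speedup, probs


# Made-up feature values for a single kernel.
example = {
    "operations": 0.42, "control_flows": 0.10, "memory_accesses": 0.35,
    "bitwise_ops": 0.05, "loop_count": 0.80, "l1_dcache_miss_ratio": 0.12,
    "branch_miss_ratio": 0.03, "exec_time": 0.60,
}
policy, speedup, probs = TinyMapper().predict(example)
print(policy, round(speedup, 3))
```

Because the weights are random, the printed policy and speedup are meaningless until the network is trained on measured (features, best policy, speedup) pairs; the sketch only fixes the data flow from features to the two outputs.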
Zheng Wang