Dominik Grewe

A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL

http://??

The authors propose a purely static approach based on predictive modeling and program features. They extract static code features from the OpenCL kernel source code and, at run time, pass run-time information such as kernel arguments and the index space to the model. Using Support Vector Machines, they partition the workload of an OpenCL kernel and map it onto heterogeneous CPU-GPU systems. However, static features are less accurate than runtime features, which degrades the quality of the SVM model.

SVN: Using Branches

$ svn mkdir http://svn.example.com/repos/calc/branches -m "make the branches directory to hold all the branches"

$ svn copy http://svn.example.com/repos/calc/trunk http://svn.example.com/repos/calc/branches/my-calc-branch -m "Creating a private branch of /calc/trunk."

$ svn checkout http://svn.example.com/repos/calc/branches/my-calc-branch

Crop pages for PDF

http://help.adobe.com/en_US/Acrobat/9.0/Professional/WS546948FF-6085-4b14-8640-D9EDE30AD8CB.w.html

Crop a page with the Crop tool
1. Choose Tools > Advanced Editing > Crop Tool.
2. Drag a rectangle on the page you want to crop. If necessary, drag the corner handles of the cropping rectangle until the page is the size you want.
3. Double-click inside the cropping rectangle, or press Shift+Return.
The Crop Pages dialog box opens, indicating the margin measurements of the cropping rectangle and the page to be cropped. You can override these settings or apply other options by making new selections in the dialog box before clicking OK.

Massimiliano Fatica

Accelerating linpack with CUDA on heterogenous clusters

http://portal.acm.org/citation.cfm?id=1513895.1513901

The author calculates the PCIe bandwidth and the peak GFLOPS of a CPU and a GPU, then uses these figures together with the input data size to calculate the execution time and obtain the optimal split fraction. The author does not overlap execution with data transfer, because on Intel systems that use a Front Side Bus (FSB) the memory system cannot supply data to both the PCIe bus and the CPU at maximum speed. On newer Intel systems with QuickPath Interconnect (QPI), this may no longer be the case.
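
To make the split-fraction idea concrete, here is a minimal sketch under my own assumptions, not the formula from the paper: work and transfer volume both scale linearly with the fraction sent to the GPU, transfer time is added to GPU time because the two are not overlapped, and the split is chosen so both devices finish together. The function name and parameters are hypothetical.

```python
def optimal_gpu_fraction(flops, bytes_to_transfer,
                         cpu_gflops, gpu_gflops, pcie_gb_per_s):
    """Fraction of the work to give the GPU so both devices finish together.

    Assumes a linear cost model (hypothetical, for illustration only).
    """
    # Time each device would need for the whole problem on its own.
    cpu_time_full = flops / (cpu_gflops * 1e9)
    gpu_compute_full = flops / (gpu_gflops * 1e9)
    # Transfers are not overlapped with execution, so they add to the GPU side.
    gpu_transfer_full = bytes_to_transfer / (pcie_gb_per_s * 1e9)
    gpu_time_full = gpu_compute_full + gpu_transfer_full
    # Balance f * gpu_time_full == (1 - f) * cpu_time_full and solve for f.
    return cpu_time_full / (cpu_time_full + gpu_time_full)


if __name__ == "__main__":
    # Made-up numbers: 1 TFLOP of work, 4 GB moved over PCIe.
    f = optimal_gpu_fraction(1e12, 4e9, cpu_gflops=10,
                             gpu_gflops=90, pcie_gb_per_s=4)
    print(f"GPU fraction: {f:.3f}")
```

With no data to transfer, the result reduces to the ratio of GPU peak to combined peak; any transfer cost shifts work back toward the CPU.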

Amnon Barak

A Package for OpenCL Based Heterogeneous Computing on Clusters with Many GPU Devices

http://www.mosix.org/txt_pub.html

The authors provide a package named Many GPUs Package (MGP). MGP runs OpenCL applications on a cluster consisting of many nodes, each with one or more GPUs, without modifying the OpenCL kernels. The user must port the OpenCL host program to one that uses the MGP API. The paper does not address how to distribute the workload across nodes.

Chi-Keung Luk

Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5375318&tag=1

The authors present a technique named Qilin that distributes the workload between CPUs and a GPU. Qilin maintains a database that provides execution-time projections for all the programs it has ever executed. It projects the execution time on the CPU and on the GPU empirically: it runs a program on a training set that is divided into two chunks, one for the CPU and one for the GPU. Each chunk is further divided into sub-chunks, and Qilin measures the execution time of every sub-chunk. It then uses curve fitting to construct two linear equations, one for the CPU and one for the GPU. With this database, Qilin can predict a well-balanced workload distribution.
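
The curve-fitting step can be sketched as follows. This is my own illustration, not Qilin's code: it assumes each device's time is modeled as T(x) = a + b*x, fitted by ordinary least squares to the sub-chunk timings, and that the CPU share is chosen so both fitted lines predict the same finish time. The function names and sample timings are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def cpu_share(cpu_fit, gpu_fit, total_items):
    """CPU fraction beta such that T_cpu(beta*N) == T_gpu((1 - beta)*N)."""
    a_c, b_c = cpu_fit
    a_g, b_g = gpu_fit
    # Solve a_c + b_c*beta*N = a_g + b_g*(1 - beta)*N for beta.
    beta = (a_g - a_c + b_g * total_items) / ((b_c + b_g) * total_items)
    # Clamp in case one device should take all the work.
    return min(1.0, max(0.0, beta))


if __name__ == "__main__":
    # Made-up sub-chunk sizes and measured times (seconds).
    sizes = [100, 200, 300, 400]
    cpu_fit = fit_line(sizes, [4.0, 8.0, 12.0, 16.0])
    gpu_fit = fit_line(sizes, [1.0, 2.0, 3.0, 4.0])
    print(cpu_share(cpu_fit, gpu_fit, total_items=10_000))
```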

Canqun Yang

Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing

http://www.computer.org/portal/web/csdl/doi/10.1109/CLUSTER.2010.12

The authors present an adaptive optimization for heterogeneous CPU/GPU systems. They distribute the workload between the CPU and the GPU using results from previous runs of the program: they measure the workload and execution time on each device, then recalculate the fraction of the workload mapped to each. The runtime saves this fraction in a table and uses it the next time the program runs.
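
A rough sketch of such a feedback loop is shown below. It is an assumption-laden illustration rather than the authors' runtime: the new CPU fraction is taken to be the CPU's measured throughput divided by the combined throughput, and the table is a plain dictionary keyed by a program name. The class and method names are hypothetical.

```python
class AdaptiveSplitter:
    """Remembers, per program, the CPU fraction derived from the last run."""

    def __init__(self, initial_cpu_fraction=0.5):
        self.initial = initial_cpu_fraction
        self.table = {}  # program key -> CPU fraction for the next run

    def cpu_fraction(self, key):
        # Fall back to the initial guess for programs never seen before.
        return self.table.get(key, self.initial)

    def record_run(self, key, cpu_work, cpu_time, gpu_work, gpu_time):
        # Throughput = work items completed per second on each device.
        cpu_rate = cpu_work / cpu_time
        gpu_rate = gpu_work / gpu_time
        # Give each device work in proportion to its measured throughput.
        fraction = cpu_rate / (cpu_rate + gpu_rate)
        self.table[key] = fraction
        return fraction


if __name__ == "__main__":
    splitter = AdaptiveSplitter()
    print(splitter.cpu_fraction("dgemm"))  # initial guess
    # Made-up measurements: equal work, the GPU finished four times faster.
    splitter.record_run("dgemm", cpu_work=500, cpu_time=10.0,
                        gpu_work=500, gpu_time=2.5)
    print(splitter.cpu_fraction("dgemm"))  # fraction used next time
```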