THRUST QUICK START GUIDE
DU-06716-001_v6.0 | February 2014
TABLE OF CONTENTS
Chapter 1. Introduction
1.1. Installation and Versioning
Chapter 2. Vectors
2.1. Thrust Namespace
2.2. Iterators and Static Dispatching
Chapter 3. Algorithms
3.1. Transformations
3.2. Reductions
3.3. Prefix-Sums
3.4. Reordering
3.5. Sorting
Chapter 4. Fancy Iterators
4.1. constant_iterator
4.2. counting_iterator
4.3. transform_iterator
4.4. permutation_iterator
4.5. zip_iterator
Chapter 5. Additional Resources
www.nvidia.com
Thrust Quick Start Guide
DU-06716-001_v6.0 | ii
Chapter 1.
INTRODUCTION
Thrust is a C++ template library for CUDA based on the Standard Template Library
(STL). Thrust allows you to implement high performance parallel applications with
minimal programming effort through a high-level interface that is fully interoperable
with CUDA C.
Thrust provides a rich collection of data parallel primitives such as scan, sort, and
reduce, which can be composed together to implement complex algorithms with
concise, readable source code. By describing your computation in terms of these high-
level abstractions you provide Thrust with the freedom to select the most efficient
implementation automatically. As a result, Thrust can be utilized in rapid prototyping
of CUDA applications, where programmer productivity matters most, as well as in
production, where robustness and absolute performance are crucial.
This document describes how to develop CUDA applications with Thrust. The tutorial is
intended to be accessible, even if you have limited C++ or CUDA experience.
1.1. Installation and Versioning
Installing the CUDA Toolkit will copy Thrust header files to the standard CUDA include
directory for your system. Since Thrust is a template library of header files, no further
installation is necessary to start using Thrust.
In addition, new versions of Thrust continue to be available online through the GitHub
Thrust project page. The version of Thrust included in this version of the CUDA Toolkit
corresponds to version 1.7.0 from the Thrust project page.
Chapter 2.
VECTORS
Thrust provides two vector containers, host_vector and device_vector. As the
names suggest, host_vector is stored in host memory while device_vector lives
in GPU device memory. Thrust’s vector containers are just like std::vector in the C++
STL. Like std::vector, host_vector and device_vector are generic containers
(able to store any data type) that can be resized dynamically. The following source code
illustrates the use of Thrust’s vector containers.
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

#include <iostream>

int main(void)
{
    // H has storage for 4 integers
    thrust::host_vector<int> H(4);

    // initialize individual elements
    H[0] = 14;
    H[1] = 20;
    H[2] = 38;
    H[3] = 46;

    // H.size() returns the size of vector H
    std::cout << "H has size " << H.size() << std::endl;

    // print contents of H
    for(int i = 0; i < H.size(); i++)
        std::cout << "H[" << i << "] = " << H[i] << std::endl;

    // resize H
    H.resize(2);
    std::cout << "H now has size " << H.size() << std::endl;

    // Copy host_vector H to device_vector D
    thrust::device_vector<int> D = H;

    // elements of D can be modified
    D[0] = 99;
    D[1] = 88;

    // print contents of D
    for(int i = 0; i < D.size(); i++)
        std::cout << "D[" << i << "] = " << D[i] << std::endl;

    // H and D are automatically deleted when the function returns
    return 0;
}
As this example shows, the = operator can be used to copy a host_vector to a
device_vector (or vice-versa). The = operator can also be used to copy host_vector
to host_vector or device_vector to device_vector. Also note that individual
elements of a device_vector can be accessed using the standard bracket notation.
However, because each of these accesses requires a call to cudaMemcpy, they should be
used sparingly. We’ll look at some more efficient techniques later.
It’s often useful to initialize all the elements of a vector to a specific value, or to copy only
a certain set of values from one vector to another. Thrust provides a few ways to do these
kinds of operations.
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

#include <thrust/copy.h>
#include <thrust/fill.h>
#include <thrust/sequence.h>

#include <iostream>

int main(void)
{
    // initialize all ten integers of a device_vector to 1
    thrust::device_vector<int> D(10, 1);

    // set the first seven elements of a vector to 9
    thrust::fill(D.begin(), D.begin() + 7, 9);

    // initialize a host_vector with the first five elements of D
    thrust::host_vector<int> H(D.begin(), D.begin() + 5);

    // set the elements of H to 0, 1, 2, 3, ...
    thrust::sequence(H.begin(), H.end());

    // copy all of H back to the beginning of D
    thrust::copy(H.begin(), H.end(), D.begin());

    // print D
    for(int i = 0; i < D.size(); i++)
        std::cout << "D[" << i << "] = " << D[i] << std::endl;

    return 0;
}
Here we’ve illustrated use of the fill, copy, and sequence functions. The copy
function can be used to copy a range of host or device elements to another host or
device vector. Like the corresponding STL function, thrust::fill simply sets a range
of elements to a specific value. Thrust’s sequence function can be used to create a
sequence of equally spaced values.
2.1. Thrust Namespace
You’ll notice that we use things like thrust::host_vector or thrust::copy in our
examples. The thrust:: part tells the C++ compiler that we want to look inside the
thrust namespace for a specific function or class. Namespaces are a nice way to avoid
name collisions. For instance, thrust::copy is different from std::copy provided in
the STL. C++ namespaces allow us to distinguish between these two copy functions.
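To make the name-collision point concrete, here is a minimal sketch in standard C++ that compiles without Thrust. The demo namespace and its reversed copy are hypothetical, invented only for this illustration; std::copy is the real STL function. Because every call is fully qualified, there is never any doubt about which copy is meant.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical namespace, invented for this illustration only.
namespace demo {
    // Same name as std::copy, but writes the input range in reverse order.
    // Requires bidirectional iterators because it walks backwards from `last`.
    template <typename InputIt, typename OutputIt>
    OutputIt copy(InputIt first, InputIt last, OutputIt out) {
        while (first != last) {
            --last;
            *out = *last;
            ++out;
        }
        return out;
    }
}

// Copies `src` with the STL function: element order is preserved.
inline std::vector<int> copy_with_std(const std::vector<int>& src) {
    std::vector<int> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Copies `src` with the hypothetical demo::copy: element order is reversed.
inline std::vector<int> copy_with_demo(const std::vector<int>& src) {
    std::vector<int> dst(src.size());
    demo::copy(src.begin(), src.end(), dst.begin());
    return dst;
}
```

The two helpers call functions with the same name and the same argument types, yet they produce different results. The namespace qualifier is the only thing that tells them apart, just as thrust::copy and std::copy are told apart in this guide.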
2.2. Iterators and Static Dispatching
In this section we used expressions like H.begin() and H.end() or offsets like
D.begin() + 7. The result of begin() and end() is called an iterator in C++. In the
case of vector containers, which are really just arrays, iterators can be thought of as
pointers to array elements. Therefore, H.begin() is an iterator that points to the first
element of the array stored inside the H vector. Similarly, H.end() points to the element
one past the last element of the H vector.
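The pointer analogy can be checked with an ordinary std::vector, used here as a stand-in for the array inside a Thrust vector (an assumption made so that the sketch compiles without CUDA). The helper names are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Number of elements in the half-open range [begin, end).
inline std::size_t range_length(const std::vector<int>& v) {
    return static_cast<std::size_t>(v.end() - v.begin());
}

// Sums the elements by walking an iterator from begin() up to, but not
// including, end(). end() itself is never dereferenced.
inline int sum_with_iterators(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}

// For a valid index k, begin() refers to the same element as &v[0] and
// begin() + k refers to v[k], just like pointer arithmetic on an array.
inline bool iterator_matches_pointer(const std::vector<int>& v, std::size_t k) {
    return !v.empty() && k < v.size()
        && &*v.begin() == &v[0]
        && &*(v.begin() + static_cast<std::ptrdiff_t>(k)) == &v[k];
}
```

Note that the loop stops when the iterator compares equal to end(). Since end() marks the position one past the last element, it serves only as a boundary and must not be dereferenced.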
Although vector iterators are similar to pointers, they carry more information with
them. Notice that we did not have to tell thrust::fill that it was operating on
a device_vector iterator. This information is captured in the type of the iterator
returned by D.begin(), which is different from the type returned by H.begin(). When
a Thrust function is called, it inspects the type of the iterator to determine whether to use
a host or a device implementation. This process is known as static dispatching since the
host/device dispatch is resolved at compile time. Note that this implies that there is no
runtime overhead to the dispatch process.
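The following is a minimal sketch of compile-time dispatch on an iterator's type, written in standard C++. It illustrates the general technique only and should not be read as Thrust's actual implementation: the tag types, the two toy iterator structs, and the function names are all hypothetical.

```cpp
// Hypothetical tag types that identify where an iterator's data lives.
struct host_tag {};
struct device_tag {};

// Toy iterator types: each one records its memory space in a nested typedef.
struct host_iter   { typedef host_tag   system; int* ptr; };
struct device_iter { typedef device_tag system; int* ptr; };

// Identifies which implementation was selected.
enum class Path { Host, Device };

// One overload per memory space. Overload resolution chooses between them
// at compile time, based only on the static type of the tag argument.
constexpr Path fill_impl(host_tag)   { return Path::Host; }
constexpr Path fill_impl(device_tag) { return Path::Device; }

// Front-end function: reads the tag from the iterator type and forwards
// to the matching overload. No runtime branch is involved.
template <typename Iterator>
constexpr Path which_fill(Iterator) {
    return fill_impl(typename Iterator::system());
}
```

Because which_fill is constexpr, its result can be verified with static_assert, for example static_assert(which_fill(device_iter{}) == Path::Device, "device path"). A check that passes at compile time is consistent with the point made above: the choice between host and device code is settled before the program runs.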
You may wonder what happens when a “raw” pointer is used as an argument to a
Thrust function. Like the STL, Thrust permits this usage, and it dispatches to the host
path of the algorithm. If the pointer in question is in fact a pointer to device memory,
then you’ll need to wrap it with thrust::device_ptr before calling the function. For
example:
size_t N = 10;

// raw pointer to device memory
int * raw_ptr;
cudaMalloc((void **) &raw_ptr, N * sizeof(int));

// wrap raw pointer with a device_ptr
thrust::device_ptr<int> dev_ptr(raw_ptr);

// use device_ptr in thrust algorithms
thrust::fill(dev_ptr, dev_ptr + N, (int) 0);
To extract a raw pointer from a device_ptr, apply raw_pointer_cast
as follows:
size_t N = 10;

// create a device_ptr
thrust::device_ptr<int> dev_ptr = thrust::device_malloc<int>(N);

// extract raw pointer from device_ptr
int * raw_ptr = thrust::raw_pointer_cast(dev_ptr);
Another reason to distinguish between iterators and pointers is that iterators can be used
to traverse many kinds of data structures. For example, the STL provides a linked list
container (std::list) that provides bidirectional (but not random access) iterators.
Although Thrust does not provide device implementations of such containers, it is
compatible with them.
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <list>
#include <vector>

int main(void)
{
    // create an STL list with 4 values
    std::list<int> stl_list;
    stl_list.push_back(10);
    stl_list.push_back(20);
    stl_list.push_back(30);
    stl_list.push_back(40);

    // initialize a device_vector with the list
    thrust::device_vector<int> D(stl_list.begin(), stl_list.end());

    // copy a device_vector into an STL vector
    std::vector<int> stl_vector(D.size());
    thrust::copy(D.begin(), D.end(), stl_vector.begin());

    return 0;
}
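The difference between bidirectional and random access iterators mentioned above can also be inspected directly in standard C++. The sketch below uses only the STL; is_random_access and sum_range are hypothetical helpers written for this illustration, while std::iterator_traits and the iterator category tags are part of the standard library.

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// True when Iterator is tagged as a random access iterator (supports it + n).
template <typename Iterator>
struct is_random_access
    : std::is_same<typename std::iterator_traits<Iterator>::iterator_category,
                   std::random_access_iterator_tag> {};

// Compile-time checks: vector iterators are random access, list iterators are not.
static_assert(is_random_access<std::vector<int>::iterator>::value,
              "vector iterators support it + n");
static_assert(!is_random_access<std::list<int>::iterator>::value,
              "list iterators only step with ++ and --");

// Sums a range using only !=, ++ and *, which every input iterator provides,
// so the same code accepts std::list and std::vector iterators alike.
template <typename Iterator>
int sum_range(Iterator first, Iterator last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}
```

An algorithm written against the weakest operations it needs, as sum_range is, works on any container that offers those operations. This is what lets a range taken from a std::list be passed where a range of vector elements would also be accepted.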
For Future Reference: The iterators we’ve covered so far are useful, but fairly basic. In
addition to these normal iterators, Thrust also provides a collection of fancy iterators with
names like counting_iterator and zip_iterator. While they look and feel like
normal iterators, fancy iterators are capable of more exciting things. We’ll revisit this
topic later in the tutorial.