------------------------------------------------------------------------------
NVIDIA CUDA Profiler Tools Interface (CUPTI)
Release Notes
CUDA Toolkit 4.1
------------------------------------------------------------------------------

FILES IN THE RELEASE
--------------------
* <cupti_dir>/include  : Contains CUPTI header files

* <cupti_dir>/lib*     : Contains the CUPTI library

* <cupti_dir>/sample   : Contains samples showing use of the CUPTI APIs

* <cupti_dir>/doc      : Contains the CUPTI release notes and User's Guide


SUPPORTED DISTRIBUTIONS
-----------------------
CUPTI is supported on all platforms on which the CUDA Toolkit is supported.


SYSTEM REQUIREMENTS
-------------------
. CUDA-enabled GPU
  See http://www.nvidia.com/object/cuda_learn_products.html

. NVIDIA Display Driver

. NVIDIA CUDA Toolkit


INSTALLATION AND SETUP
----------------------
1) Install the NVIDIA display driver

2) Install the NVIDIA CUDA Toolkit

Installing the Toolkit installs CUPTI into <cuda_dir>/extras/CUPTI
(<cuda_dir> is the directory specified during Toolkit installation).


COMPILING AND RUNNING CUPTI SAMPLES
-----------------------------------
On Windows, compiling and running the CUPTI samples using the included
Makefiles requires the Cygwin environment.

To compile:
 > cd <cupti_dir>/sample/<sample>
 > make

To run the sample:
 > make run


INCOMPATIBLE CHANGES FROM CUPTI 4.0
-----------------------------------
A number of non-backward-compatible API changes were made in 4.1.
These changes require minor source modifications to existing code
compiled against CUPTI 4.0. In addition, some previously incorrect and
undefined behavior is now prevented by improved error checking. Your
code may need to be modified to handle these new error cases.

- Multiple CUPTI subscribers are not allowed. In 4.0, cuptiSubscribe()
  could be used to enable multiple subscriber callback functions to be
  active at the same time. When multiple callback functions were
  subscribed, invocation of those callbacks did not respect the domain
  registration for those callback functions. In 4.1, cuptiSubscribe()
  returns CUPTI_ERROR_MAX_LIMIT_REACHED if there is already an active
  subscriber.

- The CUpti_EventID values for Tesla devices have changed in 4.1 to
  make all CUpti_EventID values unique across all devices. Going
  forward, CUpti_EventID values will be added for new devices and
  events, but existing values will not be changed. If your application
  has stored CUpti_EventID values (for example, as part of the data
  collected for a profiling session), those CUpti_EventIDs must be
  translated to the new ID values before being used in 4.1 APIs.

- In enumeration CUpti_EventDomainAttribute,
  CUPTI_EVENT_DOMAIN_MAX_EVENTS has been removed. The number of events
  in an event domain can be retrieved with
  cuptiEventDomainGetNumEvents().

- cuptiDeviceGetAttribute(), cuptiEventGroupGetAttribute() and
  cuptiEventGroupSetAttribute() now take a size parameter, and the
  'value' parameter now has type 'void *'.

- cuptiEventDomainGetAttribute() no longer takes a CUdevice
  parameter. This function is now used to get event domain attributes
  that are device independent. A new function
  cuptiDeviceGetEventDomainAttribute() is added to get event domain
  attributes that are device dependent.

- cuptiEventDomainGetNumEvents(), cuptiEventDomainEnumEvents() and
  cuptiEventGetAttribute() no longer take a CUdevice parameter.

- The contextUid field of the CUpti_CallbackData structure has been
  changed from type uint64_t to type uint32_t.
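
The CUpti_EventID translation described in the list above could be
sketched as a simple lookup table. This is only an illustration: the
type event_id_t, the table example_map and the function
translate_event_id() are hypothetical names that are not part of
CUPTI, and the ID pairs are placeholders rather than real 4.0 or 4.1
values. The sketch assumes that CUpti_EventID is a 32-bit unsigned
integer and uses a stand-in typedef so that it builds without the
CUPTI headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CUpti_EventID (assumed to be a 32-bit unsigned integer),
 * so that this sketch builds without the CUPTI headers. */
typedef uint32_t event_id_t;

/* One entry pairs an ID stored by a 4.0-based tool with its 4.1 value. */
typedef struct {
    event_id_t old_id;
    event_id_t new_id;
} event_id_map_entry;

/* Placeholder pairs only: real values must be determined per device. */
static const event_id_map_entry example_map[] = {
    { 1u, 1001u },
    { 2u, 1002u },
    { 3u, 1003u },
};

/* Looks up old_id in map. On success stores the new ID in *out and
 * returns 1. Returns 0 and leaves *out untouched if there is no entry. */
int translate_event_id(const event_id_map_entry *map, size_t count,
                       event_id_t old_id, event_id_t *out)
{
    assert(map != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (map[i].old_id == old_id) {
            *out = map[i].new_id;
            return 1;
        }
    }
    return 0;
}
```

A tool that loads a stored 4.0 profiling session might call
translate_event_id() for each saved ID and report any ID without a
mapping instead of passing it unchanged to a 4.1 API.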


KNOWN ISSUES
------------

- CUPTI activity record collection must be initialized before any CUDA
  function is invoked. Otherwise, activity collection may be
  incomplete or entirely disabled. Make sure that a CUPTI activity API
  function (such as cuptiActivityEnable()) is called before the first
  CUDA driver or runtime function.
