On Fri, Sep 10, 2010 at 1:06 PM, Alex G &lt;mr.nuke.me@gmail.com&gt; wrote:<br>&gt; -----BEGIN PGP SIGNED MESSAGE-----<br>&gt; Hash: SHA1<br>&gt;<br>&gt; I have a project that I&#39;m converting from plain ole&#39; makefile to cmake.<br>


&gt; I need to have the build result be .ptx and .cubin (both, as the app<br>&gt; attempts to read the .cubin first, and if it&#39;s incompatible with the GPU<br>&gt; architecture, assemble the .ptx file).<br>&gt;<br>&gt; The way I did it with the ole&#39; makefile approach was to invoke nvcc<br>


&gt; --ptx on File1.cu and File2.cu, and then invoke nvcc --cubin on the<br>&gt; resulting File1.ptx, and File2.ptx. The latter step created File1.cubin,<br>&gt; and File2.cubin.<br>&gt;<br>&gt; I&#39;ve looked over the documentation of FindCUDA, and it seemed that<br>


&gt; CUDA_COMPILE and CUDA_COMPILE_PTX might be my best bets; however, I<br>&gt; can&#39;t figure out how to set targets to be .ptx, or .cubin files.<br>&gt; CUDA_ADD_LIBRARY wants to generate an (empty) object library, which only<br>


&gt; adds headaches, as the project compilation fails when gcc is invoked to<br>&gt; create these libs. This is both redundant and annoying.<br>&gt;<br>&gt; I only need the .ptx and .cubin files. I would greatly appreciate a hint<br>


&gt; or pointers.<br>&gt;<br>&gt; Alex<br><br>Alex, thanks for your interest.  There is an option called CUDA_BUILD_CUBIN, which builds the cubin along with the OBJ file, but it appears to be disabled for PTX compilation.  I&#39;m not exactly sure why, and I even wrote it!  <br>


<br>I&#39;m in the process of adding a new target type, CUBIN, to the CUDA_WRAP_SRCS macro, which currently supports only OBJ and PTX.  In the meantime, you have two options:<br><br>If you want to modify FindCUDA.cmake, you can edit the following lines (around line 1049, depending on your version):<br>


<br><span style="font-family: courier new,monospace;">      set(build_cubin OFF)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">      if ( NOT CUDA_BUILD_EMULATION AND CUDA_BUILD_CUBIN )</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">         # comment out this line </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">         # if ( NOT compile_to_ptx )</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">           set ( build_cubin ON )</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">         # and this line</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">         # endif( NOT compile_to_ptx )</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">      endif( NOT CUDA_BUILD_EMULATION AND CUDA_BUILD_CUBIN )</span><br style="font-family: courier new,monospace;">
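<br>With that edit in place, turning the option on should be enough to get the cubin pass for PTX compilation as well. A minimal sketch, assuming the File1.cu and File2.cu from your description and a FindCUDA version that provides CUDA_COMPILE_PTX. Where the cubin output ends up depends on your FindCUDA version, so check the build directory, and note that the generated files still have to be attached to some target before anything is built:<br><br>
<span style="font-family: courier new,monospace;">find_package(CUDA REQUIRED)</span><br>
<span style="font-family: courier new,monospace;"># Set after find_package(CUDA) so this value overrides the cached default (OFF).</span><br>
<span style="font-family: courier new,monospace;">set(CUDA_BUILD_CUBIN ON)</span><br>
<span style="font-family: courier new,monospace;"># generated_ptx_files receives the list of generated .ptx paths.</span><br>
<span style="font-family: courier new,monospace;">CUDA_COMPILE_PTX(generated_ptx_files File1.cu File2.cu)</span><br>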


<br>If you can&#39;t edit your FindCUDA.cmake script, you can add a new compile step that generates the CUBIN from your PTX.<br><br><span style="font-family: courier new,monospace;"># Compile the CUDA code to PTX. &lt;my_target&gt; is just a string used to set both the shared library flag &lt;my_target&gt;_EXPORTS and the prefix of the generated file names.</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">CUDA_WRAP_SRCS(my_target PTX generated_ptx_files myfile.cu myfile2.cu myfile3.cu)</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;"><br># FindCUDA doesn&#39;t look for ptxas, but you can do it yourself:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">find_program(CUDA_PTXAS_EXECUTABLE NAMES ptxas PATHS &quot;${CUDA_TOOLKIT_ROOT_DIR}&quot; PATH_SUFFIXES bin)</span><br style="font-family: courier new,monospace;">


<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"># Now set up the build rules to compile the PTX to CUBINs.</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">set(generated_cubin_files)</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">foreach(ptx_file ${generated_ptx_files})</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">  # You can get creative and use things like get_filename_component() to strip off the ptx from the filename.</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">  set(generated_file &quot;${ptx_file}.cubin&quot;)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">  add_custom_command(</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">    OUTPUT ${generated_file}</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    # Each cubin depends on the PTX file it is assembled from</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">    MAIN_DEPENDENCY &quot;${ptx_file}&quot;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    # Here&#39;s the ptxas command</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">    COMMAND ${CUDA_PTXAS_EXECUTABLE} &quot;${ptx_file}&quot; -o &quot;${generated_file}&quot;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    COMMENT &quot;Generating ${generated_file}&quot;</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">    )</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">  list(APPEND generated_cubin_files &quot;${generated_file}&quot;)</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">endforeach()<br><br><font face="arial,helvetica,sans-serif">Then make sure that you add your CUDA files, ptx files, and generated_cubin_files to your library or executable target.<br>
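<br>If you want only the .ptx and .cubin files and no library at all, a custom target is one way to attach them. A minimal sketch, assuming the loop above has filled generated_cubin_files; the target name cuda_kernels is made up for illustration:<br><br>
<span style="font-family: courier new,monospace;"># Hypothetical target name. ALL puts it in the default build.</span><br>
<span style="font-family: courier new,monospace;">add_custom_target(cuda_kernels ALL</span><br>
<span style="font-family: courier new,monospace;">  DEPENDS ${generated_ptx_files} ${generated_cubin_files})</span><br>
<br>For shorter output names, the set(generated_file ...) line in the loop could be swapped for something along these lines. The variable names ptx_dir and ptx_name are illustrative, and NAME_WE strips everything from the first dot onward, so the exact result depends on how your FindCUDA version names the .ptx files:<br><br>
<span style="font-family: courier new,monospace;"># Directory and extension-free base name of the generated PTX file.</span><br>
<span style="font-family: courier new,monospace;">get_filename_component(ptx_dir &quot;${ptx_file}&quot; PATH)</span><br>
<span style="font-family: courier new,monospace;">get_filename_component(ptx_name &quot;${ptx_file}&quot; NAME_WE)</span><br>
<span style="font-family: courier new,monospace;">set(generated_file &quot;${ptx_dir}/${ptx_name}.cubin&quot;)</span><br>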


<br>If you have the PTX, why do you need to generate CUBINs at all?  You should be able to let the driver compile the CUBIN and be ready to go no matter what device you have created.<br><br>James<br></font></span>