Aussie AI
Chapter 4. CUDA Emulation
-
Book Excerpt from "CUDA C++ Debugging: Safer GPU Kernel Programming"
-
by David Spuler
CUDA CPU Emulation
Is it possible to run a CUDA program without a GPU? This is desirable for playing around to learn CUDA, or teaching a class of students about CUDA programming.
There was once a CUDA emulator as part of the main toolkit, enabled via the nvcc "-deviceemu" option, but it has since been removed. The last version to support it was CUDA Toolkit 3.0; later releases dropped device emulation entirely.
Once upon a time there was also a PGI compiler that could run CUDA programs on a CPU. NVIDIA acquired PGI in 2013, the PGI compilers were merged into the NVIDIA HPC SDK, and the PGI name and products were subsequently retired.
However, here’s a solution in the cloud: Google Colab offers a free tier whereby you can run CUDA C++ code on a virtual machine. It’s not really an “emulation” but more like a full GPU for free up in the cloud. You can set up a Linux virtual environment with a real GPU attached and CUDA installed, and it’s free for low-end T4 GPUs (as of this writing). You have to pay for some more advanced capabilities like A100 GPUs, but the low-end tier is fine for learning and experimenting with CUDA. I’ve described how to set that up further below.
CUDA C++ Emulation Library
At Aussie AI, we have implemented a CUDA wrapper library in basic C++ for emulation of a very small subset of CUDA on CPU. This is primarily useful as a learning and teaching tool, but does not support enough CUDA primitives for production usage. Find more details at https://www.aussieai.com/cuda/projects.
The idea is to run basic CUDA C++ code without a GPU, so that kernels can be tested on non-CUDA platforms such as Microsoft Visual C++ on Windows and GCC on Linux. The main advantages:
- No GPU needed!
- Does not need the CUDA Toolkit installed.
- Non-CUDA C++ compiler support.
This library is primarily for educational and basic testing purposes. You can run some simple kernels in a simple C++ environment, and learn some of the basics. The emulation will also detect some common failures in your basic CUDA kernels, as part of the emulation mode on CPU.
Main features. The emulator works by intercepting the CUDA primitives in basic C++, and then calling emulation versions of them. The main capabilities include:
- Emulates several basic CUDA primitives (e.g., cudaMalloc, cudaMemcpy).
- Runs in standard C++ on Microsoft Visual Studio on Windows and gcc on Linux.
- Launches CUDA kernels in an emulation mode that runs the threads sequentially (simpler to debug).
- Detects various common CUDA primitive errors (e.g., memory errors, double deallocation).
- Detects common kernel programming errors (e.g., array bounds violations in threads).
How it works. The basic architecture for the emulation library is:
- Source code interception in a basic C++ compiler (i.e., not NVCC).
- Preprocessor macro interception of CUDA primitives (e.g., cudaMalloc, cudaFree).
- Emulation of these basic CUDA primitives in simplified C++ versions.
- Preprocessor macro interception of C++ primitives (e.g., malloc, free).
- Link-time interception of the C++ new and delete operators.
- Various error checks performed inside the emulated C++ and CUDA C++ functions.
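To make the macro-interception idea concrete, here is a minimal sketch in plain C++. This is my own illustrative code, not the actual Aussie AI library: the names emuCudaMalloc, emuCudaFree, and launch_vector_add are invented for this example. It emulates cudaMalloc and cudaFree on the CPU heap (with a double-deallocation check, one of the error checks mentioned above), and replaces a kernel launch with a sequential loop over "threads":

```cpp
#include <cstdio>
#include <cstdlib>
#include <set>

// Track live allocations so we can detect double-free errors,
// one of the checks an emulation layer can perform.
static std::set<void*> g_live_allocs;

// Emulated cudaMalloc: a plain heap allocation on the CPU.
// (Hypothetical name; a real wrapper would #define cudaMalloc to this.)
inline int emuCudaMalloc(void** ptr, size_t bytes) {
    *ptr = std::malloc(bytes);
    if (*ptr == nullptr) return 1;  // emulate a CUDA error code
    g_live_allocs.insert(*ptr);
    return 0;  // success, like cudaSuccess
}

// Emulated cudaFree with a double-deallocation check.
inline int emuCudaFree(void* ptr) {
    if (g_live_allocs.erase(ptr) == 0) {
        std::fprintf(stderr, "Emulation error: double or invalid free\n");
        return 1;
    }
    std::free(ptr);
    return 0;
}

// A "kernel" written as an ordinary function taking the thread index.
// The body is the same as it would be in a real CUDA kernel.
inline void vector_add_kernel(int tid, const float* a, const float* b,
                              float* c, int n) {
    if (tid < n) c[tid] = a[tid] + b[tid];
}

// Emulated kernel launch: the <<<grid, block>>> syntax is replaced
// by a sequential loop that runs every "thread" one at a time.
inline void launch_vector_add(int nthreads, const float* a, const float* b,
                              float* c, int n) {
    for (int tid = 0; tid < nthreads; ++tid)
        vector_add_kernel(tid, a, b, c, n);
}
```

With preprocessor macros mapping the CUDA names onto these emulation functions (and the launch syntax rewritten), simple kernels can then run and be debugged entirely on the CPU.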
Limitations. This emulation library is not a production-grade CUDA emulator by any means! Its value is more in the educational domain for learning CUDA basic concepts. Some of the main problems include:
- Only a limited subset of the CUDA APIs is intercepted.
- Most CUDA library calls are not emulated.
- The syntax is not identical (e.g., the <<<...>>> kernel launch syntax must be modified).
- Synchronization across threads in CUDA kernels is not properly emulated.
- Shared memory usage in threads is not emulated.
This emulation library may be extended or modified. Feel free to use it to learn CUDA with my best wishes on your success.
Running CUDA in Google Colab
An alternative to using CUDA Toolkit on your own machine
is to run it in the cloud on someone else’s GPU.
Google Colab is a free online environment for running and testing code
in a virtual Linux box.
It’s not really an “emulation” but it can feel like it.
You can test CUDA C++ programs using nvcc compiler and real GPU hardware somewhere underneath the virtual layers.
And did I mention: for free!
The steps are basically:
1. Open a new notebook in Google Colab
2. Change the “runtime” to be a GPU (e.g., T4 GPU)
3. Upload a CUDA C++ file to Google Colab (e.g., “test1.cu”)
4. Run the nvcc compiler.
5. Run a.out (the executable)
6. Save your notebook.
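The command-line core of the steps above boils down to two code cells (using the hypothetical filename "test1.cu"; in a Colab cell each command needs the "!" prefix):

```shell
# Cell 1: compile the CUDA source file; the output defaults to a.out.
!nvcc test1.cu

# Cell 2: run the executable. The "./" prefix is needed because the
# current directory is not on the shell's PATH.
!./a.out
```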
More details on each step are given below.
1. Open a new Google Colab virtual notebook. You need to follow these steps:
- You’ll need to be signed in to your Google Gmail account, or create a Google account.
- Navigate your browser to Google Colab: https://colab.research.google.com/
- Click on File > New Notebook.
2. Change the Notebook’s Runtime to GPU. The steps in more detail:
- Click on Runtime > Change runtime type.
- Choose a GPU, such as "T4 GPU" (free). Or you can pay more for an A100 GPU environment, but you don't need more than the free one to test simple CUDA C++ code.
- Click “Save” to confirm your choice of GPU mode.
- Now you have a virtual Linux box that is set up for a GPU, with the CUDA Toolkit already installed.
- You don't need to do any steps to install CUDA or nvcc.
3. Upload a CUDA C++ file. The steps to upload your source code file:
- Store your CUDA C++ code on your PC in a single file (for simple examples), ready for upload.
- Ensure the file suffix is ".cu" (e.g., test1.cu); note that nvcc treats a ".cpp" suffix as plain host C++ unless you add the "-x cu" option.
- Click on the "Folder" icon in Google Colab (an icon on the LHS vertical panel).
- This will expand out a view of your virtual files and folders.
- By default, you are probably in the "/content" directory on the virtual Linux filesystem.
- Click on the "Upload" icon (the top LHS icon, with an up arrow on top of a file icon).
- Choose your "test1.cu" file from your local PC drive.
- Confirm your upload choice in the file browser (e.g., click "Open" on Windows).
- The newly uploaded file should, after a brief delay, appear in the files and folders view on Google Colab.
4. Run nvcc to compile your CUDA C++ file.
Here are the steps:
- Create a new “+Code” cell in Google Colab.
- Edit the new cell to have a command like: !nvcc test1.cu
- Note that the "!" prefix is required; it means run the rest of the line as a shell command in the cell.
- Note that "nvcc" in lower case letters is the command for the NVIDIA CUDA Compiler (NVCC).
- Click on the "Play" (triangle) button or "run cell" to execute this new cell.
- This should run the nvcc CUDA C++ compiler and create your executable file, "a.out".
- Wait for the cell to finish executing (i.e., wait for the button icon to stop spinning).
- After a brief delay, you should see a new file called "a.out" appear in the Files/Folders view.
Failed compilation.
If your CUDA C++ code has a compilation error, nvcc won’t create an executable file,
and you'll get some error messages appearing in the cell's output area instead.
- If there's no new a.out file in the Folder view, nvcc probably failed to compile, most likely because of a syntax error in your CUDA C++ code. Review the error messages from nvcc.
- Edit your CUDA C++ source file to fix any errors.
- You can edit it in the virtual environment by double clicking on the filename. This opens a text editor in your Google Colab notebook, but note that you’ll lose any changes if your notebook shuts down.
- Alternatively, you can re-edit the file on your PC and re-upload the edited file to Google Colab.
- Re-run the nvcc cell to compile the newly edited CUDA C++ file and create "a.out".
5. Run your a.out executable.
- Create another “+Code” cell in Google Colab.
- Use the command: !./a.out
- Note that "!" means run the command, and "./a.out" is the path to the executable; the "./" prefix is needed because the current directory is not on the shell's PATH.
- Click on the "Play" (triangle) button to run the cell.
- The output from your CUDA C++ program should appear.
- Hooray!
6. Save your notebook (optional). Note that your uploads to Google Colab are not automatically saved. That’s too much to expect for a free service. It will eventually time out, and your uploaded files will also disappear from your notebook folders if you close your browser. If you’ve edited these files inside Google Colab, you lose your changes.
One partial fix is to create backups of your notebook,
either on your PC or in Google Drive.
There is a “Download” option for your entire notebook.
For Google Drive backups, when inside Google Colab, use the “File > Save a copy in Drive” menu.
However, this only seems to save the "notebook" part with all of its cells; it doesn't save and restore your uploaded files.
A better fix, which saves all of your files and avoids manually backing up and restoring your entire notebook, is to map Google Drive into your folder hierarchy. The idea is to "mount" your Google Drive files as a subdirectory inside your Colab notebook. Then you can save files into that folder in Colab, and they'll be stored in Google Drive. Example command to run:
from google.colab import drive
drive.mount('/content/gdrive')
After this, if you upload or edit files in the “gdrive” folder, then they’re in your Google Drive.
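For example, a cell like this would copy the uploaded source file into Drive so that it survives the session (assuming the default "MyDrive" folder name and the hypothetical filename "test1.cu"):

```shell
# Copy the CUDA source file from the Colab session into Google Drive.
!cp test1.cu /content/gdrive/MyDrive/
```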
You can upgrade to a paid version to get the capability to store a notebook in your account. Alternatively, you can just repeat the steps each time you navigate to Google Colab, assuming that your CUDA C++ files are being edited on your local box, and not virtually in the notebook.
Troubleshooting Problems on Google Colab
I had a few problems with the CUDA source file getting uploaded to the wrong virtual directory in Google Colab, sometimes ending up in the parent directory (probably user error). The result was this sort of error from nvcc:
cc1plus: fatal error: test1.cu: No such file or directory
compilation terminated.
Maybe you’ve used the wrong filename, or maybe it’s in a different subdirectory.
You can check where your “test1.cu” file is in the file hierarchy on the LHS by clicking on the Folder icon.
To see your current directory where nvcc is running in a Cell, create a new Code cell with “!pwd” command and run it (“pwd” is the Linux command for “print working directory”).
You can also run “!ls” (without any quotes) to list the files in the current working directory
in your virtual notebook.
If you somehow get nvcc running in “/content” but the “.cu” file in a higher directory, use this command
in the cell to get nvcc to find the CUDA file in the parent directory:
!nvcc ../test1.cu
You might also get this type of error message:
nvcc fatal : Don't know what to do with 'test1.cu.txt'
This error means the wrong file suffix was given to nvcc (i.e., ".txt" rather than ".cu" here), which is a reminder of the joyful experience of Windows protecting me from things. It's hard to rename the file suffix from ".txt" to ".cu" in File Explorer, and usually I have to resort to the DOS "ren" command in a command shell. But I digress.
No output appeared. If absolutely nothing appears from your CUDA "hello world" program (i.e., one with printf in the GPU kernel), and there are no compile errors from nvcc, and no errors or runtime output from a.out, maybe you've made the common mistake of not calling cudaDeviceSynchronize, as discussed earlier in the chapter.
At the risk of repeating myself,
CUDA kernel launches are asynchronous and main does not wait for your GPU code to finish,
unless you force it to.
Also, any printf output from a CUDA kernel on the GPU never appears if the CPU code has already exited. The program runs so fast that main finishes before the kernel's output is ever flushed, so it shows nothing.
The solution is to add a call to cudaDeviceSynchronize after the kernel launch, or at the end of main, which forces the CPU to wait for the GPU kernel to finish.
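As a sketch, here is a minimal "hello world" program with the fix applied (a hypothetical example, compiled with nvcc as described above):

```cuda
#include <cstdio>

// A trivial kernel: each GPU thread prints its index.
__global__ void hello_kernel() {
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    hello_kernel<<<1, 4>>>();   // asynchronous launch: main does not wait
    cudaDeviceSynchronize();    // force the CPU to wait for the kernel,
                                // so the kernel's printf output appears
    return 0;
}
```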