Game Programming Gurus

Multithreaded Programming Techniques

Up to this point, all the demos in this book have used a single-threaded event loop and programming model. The event loop reacts to the player's input and renders the game at a rate of 30+ fps. While reacting to the player, the game performs millions of operations per second and processes dozens, if not hundreds, of small tasks, such as drawing all the objects, retrieving input, making music, and so on. Figure 11.15 shows the standard game loop that you've been using.

Figure 11.15. Standard DOS single-tasking game loop.

graphics/11fig15.gif

As you can see from Figure 11.15, the game logic performs all the tasks of the game in a serial/sequential manner. Of course, there are exceptions to this, including interrupts that can perform simple logic such as music and input control, but for the most part, a game is one long sequence of function calls that repeat forever.

What makes a game seem fluid and real is the fact that even though everything is performed in sequence, step by step, the computer is so fast that it all seems as if it's happening at once. Hence, the model that most game programmers use is a single-tasking execution thread that performs many operations in series to arrive at the desired output for each frame. This is one of the best ways to do things and is a side effect of DOS game programming.

However, the days of DOS are over, so it's time that you start using the multithreaded abilities of Windows 95/98/ME/XP/NT/2000 and, well, liking it!

This section is going to cover the threads of execution under Windows 95/98/NT. These threads allow you to run multiple tasks within the same application with very little drama. Now, before you get started, let's cover a little terminology so that this simple subject isn't alien to you.

Multithreaded Programming Terminology

There are a number of "multi-" words in the computer lexicon that mean various things. Let's begin by talking about multiprocessors and multiprocessing, and then finish up with multithreading.

A multiprocessor computer is one that has more than one processor. The Cray and the Connection Machine are both good examples. The Connection Machine can have up to 65,536 (64K) processors connected in a hypercube network, and each one can be executing code.

Back down on Earth, you can purchase a quad processor Pentium III+ machine and run Windows NT on it. These are usually SMP (symmetric multiprocessing) systems, meaning that all four processors will run tasks symmetrically. Actually, that is not totally true because the OS kernel will only run on one of the processors (sorta), but as far as processes go, they will run equally well on any processor. So the idea of a multiprocessor computer is to have more than one processor to split the workload.

On some systems, only one task or process can run on each processor, while on other systems, such as Windows NT, thousands of tasks can run on each processor. This is basically multiprocessing, the running of multiple tasks on a single- or multiple-processor machine.

The last concept is multithreading, which is what you're interested in today. A process under Windows 95/98/NT/2000 is really a whole program; although it may or may not run by itself, most of the time it is an application. It has its own address space and context, and it exists by itself.

A thread, on the other hand, is a much simpler entity. Threads are created by processes and have very little identity of their own. They run in the address space of the process that created them, and they are very simple. The beauty of threads is that they get as much processor time as anything else does, and they exist in the same address space as the parent process that created them.

This means that communicating to and from threads is very simple. In essence, they are exactly what you want as a game programmer: a thread of execution that does something in parallel with your other main program tasks, that you don't have to babysit, and that has access to the variables in your program.
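To make the shared address space concrete, here is a minimal sketch. It uses the portable C++11 std::thread class as a stand-in for the Win32 calls covered in this chapter, and the names GameState and run_worker_once are made up for illustration. The worker thread updates a variable owned by the caller directly, with no copying or message passing.

```cpp
#include <cassert>
#include <thread>

// Hypothetical game state that lives in the process's address space.
struct GameState
{
    int score = 0;
};

// Starts one worker thread that updates `state` in place, waits for it
// to finish, and returns the score the caller sees afterward.
int run_worker_once(GameState& state, int points)
{
    std::thread worker([&state, points] {
        state.score += points;   // the worker touches the caller's variable
    });

    assert(worker.joinable());   // the thread was started
    worker.join();               // wait until the worker has returned
    return state.score;
}
```

Calling run_worker_once(state, 10) on a fresh GameState leaves state.score at 10. The join() is what makes the read safe in this sketch; if both threads touched score at the same time, you would need some form of synchronization.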

Along with the "multi-" words, there are a few more concepts that you need to know about. First, Windows 95, 98, NT, and 2000 are multitasking/preemptive operating systems. This means that no task, process, or thread can take control of the computer; each one will be preempted at some point and blocked, and the next thread of execution will get to run. This is completely different from Windows 3.1, which was not preemptive. If you didn't call GetMessage(...) each cycle, other processes didn't run. In Windows 95/98/NT/2000, you can sit in a for loop forever if you like and do nothing, and the OS will still run the other tasks.

Also, under Win95/98/NT/2000, each process or thread has a priority that influences how much processor time it gets relative to the other threads. So, if there are 10 threads that all have the same priority, they will all get equal time, processed in a round-robin fashion. However, if one thread has a higher priority (kernel-level priority, for example), it will of course get more run time in each cycle. Take a look at Figure 11.16 to see this.

Figure 11.16. Round robin thread execution with equal and unequal thread priorities.

graphics/11fig16.gif

Finally, this question arises: "What are the differences between Windows 95/98/NT/ 2000 multithreading?" Well, there are some differences, but for the most part you can use the Windows 95 OS model and be safe on all platforms. It's the lowest common denominator. Although 98 and NT are much more robust, I'll use a Windows 95 machine for most of the examples in this section.

Why Use Threads in a Game?

The answer to this question should be obvious by now. As a matter of fact, I think you could create a list of about 1,000 things that you could do with threads right off the bat. However, if you're just coming down from a Mountain Dew high (or Sobe, which is my new poison), here are some common uses for threads:

Updating animation
Creating ambient sound effects
Controlling small objects
Querying input devices
Updating global data structures
Creating pop-up menus and controls

That last use is one of my favorites. It's always a pain to put up menus and let the user make changes while the game is running, but with threads it's much simpler.

Now, I still haven't answered the question of why you should use threads in a game, as opposed to just making a huge loop and calling functions. Well, threads do the same thing, basically, and when you start creating more and more object-oriented software, at some point you'll come up with structures that are like automatons. These are objects that represent game characters that you want to be able to create and destroy without having logical side effects on the main game loop. This can be accomplished in the coolest way with C++ classes along with threads.
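Here is a rough sketch of such an automaton. It is written with the portable C++11 std::thread and std::atomic types rather than the Win32 API this chapter uses, and the class name Automaton, its members, and the helper run_automaton_briefly are all hypothetical. The point is only that creating the object starts its thread, and destroying the object stops and cleans up that thread, so the main loop never has to manage it.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical self-running game object: its thread lives and dies with it.
class Automaton
{
public:
    Automaton() : running_(true), steps_(0), thread_(&Automaton::run, this) {}

    ~Automaton()
    {
        running_ = false;              // ask the thread to stop
        assert(thread_.joinable());
        thread_.join();                // wait until it has really returned
    }

    int steps() const { return steps_.load(); }

private:
    void run()
    {
        do
        {
            steps_++;                  // .. one unit of character logic
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        } while (running_);
    }

    std::atomic<bool> running_;
    std::atomic<int>  steps_;
    std::thread       thread_;         // declared last: starts after the flags exist
};

// Creates an automaton, lets it take at least one step, then destroys it.
int run_automaton_briefly()
{
    Automaton npc;
    while (npc.steps() == 0)
        std::this_thread::yield();
    return npc.steps();                // destructor stops and joins the thread
}
```

The member order matters in this sketch: thread_ is declared last so that the flags it reads are already initialized when the thread starts.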

Before you get started creating your first thread, let's make something totally clear here: On a single processor computer, only one thread can execute at a time. So you still get nothing for free, but it seems that way from a software point of view, so just assume you do to make your programming easier and more correct. Figure 11.17 shows an example of a main process and three threads that are executing along with it.

Figure 11.17. Primary process spawning three secondary threads.

graphics/11fig17.gif

The timetable in the figure shows the various threads that have control of the processor, in milliseconds. As you can see, the threads run one at a time, but they can run out of order and for different amounts of time based on their priority.

Enough foreplay. Let's get to the code!

Conjuring a Thread from the Plasma Pool

You'll be using console applications for the examples that follow, so once again, please compile the programs correctly. (I'm only belaboring this because every hour I get 30-60 emails on various books I've written from people using the VC++ compiler wrong. Doesn't anyone read the introductions?)

However, there is one more caveat: You must use the multithreaded libraries for these examples. You do this by going into the main menu in MS DEV Studio, under Project, Settings, then under the C/C++ tab go to Category: Code Generation, and set the Use Run-time Library to multithreaded. This is also shown in Figure 11.18. Also, make sure to turn optimization off. It can confuse the multithreaded synchronization code sometimes, so better safe than sorry.

Figure 11.18. Creating a console application with multithreaded libraries.

graphics/11fig18.gif

NOTE

I just had deja vu. Or was it really deja vu, or just a glitch in the simulation?

If you didn't get that, you won't know what it was you didn't get so it won't matter anyway. :)

All righty then, let's get started. Creating a thread is easy; it's keeping it from destruction that's the hard part! The Win32 API call is as follows:

HANDLE CreateThread(
  LPSECURITY_ATTRIBUTES   lpThreadAttributes, // pointer to thread security attributes
  DWORD                   dwStackSize,        // initial thread stack size, in bytes
  LPTHREAD_START_ROUTINE  lpStartAddress,     // pointer to thread function
  LPVOID                  lpParameter,        // argument for new thread
  DWORD                   dwCreationFlags,    // creation flags
  LPDWORD                 lpThreadId );       // pointer to returned thread identifier

lpThreadAttributes points to a SECURITY_ATTRIBUTES structure that specifies the security attributes for the thread. If lpThreadAttributes is NULL, the thread is created with a default security descriptor and the resulting handle is not inherited.

dwStackSize specifies the size, in bytes, of the stack for the new thread. If 0 is specified, the stack size defaults to the same size as that of the primary thread of the process. The stack is allocated automatically in the memory space of the process, and it is freed when the thread terminates. Note that the stack size grows, if necessary.

CreateThread tries to commit the number of bytes specified by dwStackSize, and fails if the size exceeds available memory.

lpStartAddress points to the application-supplied function to be executed by the thread and represents the starting address of the thread. The function accepts a single 32-bit argument and returns a 32-bit exit value.

lpParameter specifies a single 32-bit parameter value passed to the thread.

dwCreationFlags specifies additional flags that control the creation of the thread. If the CREATE_SUSPENDED flag is specified, the thread is created in a suspended state and will not run until the ResumeThread() function is called. If this value is zero, the thread runs immediately after creation.

lpThreadId points to a 32-bit variable that receives the thread identifier.

If the function succeeds, the return value is a handle to the new thread. If the function fails, the return value is NULL. To get extended error information, call GetLastError().

The function call might look a bit complex, but it's really not. It just allows a lot of control. You won't use much of its functionality in most cases.

When you're done with a thread, you need to close its handle; in other words, let the operating system know that you're done using the object. This is done with the CloseHandle() function call, which uses the handle returned by CreateThread() and reduces the reference count in the kernel object that refers to the thread by 1.

You need to do this for every thread when you're done with it. This does not kill the thread; it just tells the OS that you no longer need the handle. The thread must terminate itself, be told to terminate (with TerminateThread()), or be terminated by the OS when the main thread or primary thread terminates. We'll get to all that later, but for now, just realize that this is a clean-up call that needs to be done before you exit a multithreaded app. Here is the function prototype:

BOOL CloseHandle(HANDLE  hObject );    // handle to object to close

hObject identifies an open object handle. If the function succeeds, the return value is TRUE. If the function fails, the return value is FALSE. To get extended error information, call GetLastError(). Furthermore, CloseHandle() closes handles to the following objects:

Console input or output
Events
Files
File mappings
Mutexes
Named pipes
Processes
Semaphores
Threads

Basically, CloseHandle() invalidates the specified object handle, decrements the object's handle count, and performs object retention checks. Once the last handle to an object is closed, the object is removed from the operating system.
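The idea that giving up a handle does not stop the thread is easy to get wrong, so here is a small sketch of it. It is not Win32 code: it uses the portable C++11 std::thread class, whose detach() call plays a role loosely similar to closing a thread handle, and the function name thread_survives_detach is made up. The worker is held at a gate until after the handle has been released, so if the function returns at all, the thread was still alive after the handle was gone.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <thread>

// Returns true if the std::thread object no longer refers to the worker
// after detach(). The function only returns once the worker has run.
bool thread_survives_detach()
{
    auto gate = std::make_shared<std::promise<void>>();
    auto done = std::make_shared<std::promise<void>>();
    std::shared_future<void> release  = gate->get_future().share();
    std::future<void>        finished = done->get_future();

    std::thread worker([release, done] {
        release.wait();        // held here until the gate opens
        done->set_value();     // report that the thread did its work
    });

    worker.detach();                       // give up the handle
    bool handle_gone = !worker.joinable(); // the object no longer owns a thread

    gate->set_value();         // only now may the worker proceed
    finished.wait();           // returns, so the thread outlived the handle
    return handle_gone;
}
```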

WARNING

The new thread handle is created with full access to the new thread. If a security descriptor is not provided, the handle can be used in any function that requires a thread object handle. When a security descriptor is provided, an access check is performed on all subsequent uses of the handle before access is granted. If the access check denies access, the requesting process cannot use the handle to gain access to the thread.

Now let's take a look at some code that could represent a thread you would pass for processing to CreateThread():

DWORD WINAPI My_Thread(LPVOID data)
{
// .. do work

// return an exit code at end, whatever is appropriate for your app

return(26);
}  // end My_Thread
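If you want to experiment with the "one argument in, one exit code out" shape of a thread function without a Windows compiler handy, the following sketch mimics it in portable C++11. This is an assumption-laden stand-in, not the Win32 mechanism: my_thread and run_and_get_exit_code are hypothetical names, and std::packaged_task and std::future take the place of the thread's exit code.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Stand-in for a thread function: one opaque argument, one exit code.
unsigned long my_thread(void* data)
{
    (void)data;     // .. do work
    return 26;      // exit code, same value as My_Thread above
}

// Runs my_thread on its own thread and hands back its return value.
unsigned long run_and_get_exit_code(void* data)
{
    std::packaged_task<unsigned long(void*)> task(my_thread);
    std::future<unsigned long> exit_code = task.get_future();

    std::thread worker(std::move(task), data);
    assert(worker.joinable());
    worker.join();              // wait for the thread function to return

    return exit_code.get();     // the value the thread function returned
}
```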

Now you have everything you need to create your first multithreaded app. The first example will illustrate the creation of a single thread, along with the primary thread of execution (the main program). The secondary thread will print out the number 1, and the primary thread will print out the number 2. DEMO11_5.CPP contains the complete program and is shown here for reference:

// DEMO11_5.CPP - Creates a single thread that prints
// simultaneously while the Primary thread prints.
// INCLUDES ////////////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure win headers
                             // are included correctly

#include <windows.h>    // include the standard windows stuff
#include <windowsx.h>   // include the 32 bit stuff
#include <conio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>
#include <math.h>
#include <io.h>
#include <fcntl.h>

// DEFINES ////////////////////////////////////////////////

// PROTOTYPES //////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS /////////////////////////////////////////////////

// FUNCTIONS ///////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out data
// 25 times with a slight delay

for (int index=0; index<25; index++)
    {
    printf("%d ",(int)(INT_PTR)data); // output the number passed to the thread
    Sleep(100);         // sleep a little to slow things down
    }  // end for index


// just return the data sent to the thread function

return((DWORD)data);

} // end Printer_Thread

// MAIN ///////////////////////////////////////////////////////////////

int main(void)
{
HANDLE thread_handle;  // this is the handle to the thread
DWORD  thread_id;      // this is the id of the thread

// start with a blank line
printf("\nStarting threads...\n");
// create the thread, IRL we would check for errors
thread_handle = CreateThread(NULL,        // default security
                  0,                // default stack
                  Printer_Thread,  // use this thread function
                  (LPVOID)1,       // user data sent to thread
                  0,                // creation flags, 0=start now.
                  &thread_id);    // send id back in this var

// now enter into printing loop, make sure this takes longer than thread,
// so thread finishes first
for (int index=0; index<50; index++)
    {
    printf("2 ");
    Sleep(100);
    }  // end for index

// at this point the thread should be dead
CloseHandle(thread_handle);

// end with a blank line
printf("\nAll threads terminated.\n");

}  // end main

Sample output:

Starting threads...
2 1 2 1 2 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2
2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
All threads terminated.

As you can see from the sample output, each thread of execution runs for a short time, and then the OS switches context to the next waiting thread. In this case, the OS simply toggles back and forth between the primary thread and the secondary thread.

Now let's try to create multiple threads. You can make a slight modification to DEMO11_5.CPP to add this functionality. All you need to do is call the CreateThread() function multiple times, once for each thread. Also, the data sent to each thread will be the value it prints out, so you can tell the threads apart.

DEMO11_6.CPP|EXE contains the new modified multithreaded program and is listed here for reference. Notice the use of arrays to hold the thread handles and IDs:

// DEMO11_6.CPP - A new version that creates 3
// secondary threads of execution
// INCLUDES /////////////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers
                             // are included correctly

#include <windows.h>         // include the standard windows stuff
#include <windowsx.h>        // include the 32 bit stuff
#include <conio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>
#include <math.h>
#include <io.h>
#include <fcntl.h>

// DEFINES ///////////////////////////////////////////////////

#define MAX_THREADS 3

// PROTOTYPES ///////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS /////////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out data
// 25 times with a slight delay
for (int index=0; index<25; index++)
    {
    printf("%d ",(int)data+1); // output a single character
    Sleep(100);                // sleep a little to slow things down
    }  // end for index

// just return the data sent to the thread function
return((DWORD)data);

}  // end Printer_Thread

// MAIN ///////////////////////////////////////////////////////////////

int main(void)
{

HANDLE thread_handle[MAX_THREADS];  // this holds the
                                    // handles to the threads
DWORD  thread_id[MAX_THREADS];      // this holds the ids of the threads

// start with a blank line
printf("\nStarting all threads...\n");

// create the thread, IRL we would check for errors
for (int index=0; index<MAX_THREADS; index++)
    {
    thread_handle[index] = CreateThread(NULL, // default security
                       0,                 // default stack
                      Printer_Thread,  // use this thread function
                       (LPVOID)index, // user data sent to thread
                       0,       // creation flags, 0=start now.
                       &thread_id[index]); // send id back in this var
    }  // end for index

// now enter into printing loop, make sure
// this takes longer than threads,
// so threads finish first, note that primary thread prints 4
for (int index=0; index<75; index++)
    {
    printf("4 ");
    Sleep(100);
    }  // end for index

// at this point the threads should all be dead, so close handles
for (int index=0; index<MAX_THREADS; index++)
    CloseHandle(thread_handle[index]);

// end with a blank line
printf("\nAll threads terminated.\n");

}  // end main

Sample output:

Starting all threads...
4 1 2 3 4 1 2 3 4 1 2 3 1 4 2 3 4 1 2 3 1 4 2 3 4
1 2 3 1 4 2 3 4 1 2 3 1 4 2 3 4 1 2 3 4 1 2 3 4 1
2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2
3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
All threads terminated.

Wow! Isn't that cool? It's so easy to create multiple threads. Now, if you're astute, you should be a little wary at this point, and you should question the fact that you used the same function each time for the thread callback. The reason this works correctly is that all the variables in the function are created on the stack, and each thread has its own stack. So it all works out. Take a look at Figure 11.19 to see this.

Figure 11.19. Primary and secondary thread memory and code space allocation.

graphics/11fig19.gif
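The per-thread stack is worth seeing in code. The sketch below is a portable C++11 approximation (std::thread rather than CreateThread()), and count_up and run_three_counters are illustrative names only. Three threads run the very same function, yet each has its own private copies of index and total, so they don't trample each other.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Same function for every thread. `index` and `total` are locals, so each
// thread gets its own copies on its own stack.
void count_up(int id, int* result)
{
    assert(result != nullptr);
    int total = 0;
    for (int index = 0; index < 25; index++)
        total += id;
    *result = total;    // each thread writes only to its own slot
}

// Starts three threads on count_up and collects one result per thread.
std::vector<int> run_three_counters()
{
    std::vector<int> results(3, 0);
    std::vector<std::thread> threads;

    for (int id = 0; id < 3; id++)
        threads.emplace_back(count_up, id + 1, &results[id]);

    for (std::thread& t : threads)
        t.join();       // wait for all three to return

    return results;
}
```

Each thread adds its own id 25 times, so the three slots end up holding 25, 50, and 75 regardless of how the scheduler interleaves the threads.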

Figure 11.19 overlooks something that is very important: termination. All the secondary threads terminated on their own, but the primary thread had no control over this. In addition, the primary thread really had no way to tell if the threads were complete and had terminated (that is, if they had returned).

What you need is a way to communicate between threads and check the status of threads from one another or from the primary thread itself. There is a brute-force way to terminate a thread using TerminateThread(), but I suggest that you don't use this.

Sending Messages from Thread to Thread

Let's say that you want the primary thread to have control over the spawned threads that it creates. For example, the primary thread may want to kill all the secondary threads. How can you do this? Well, there are a couple of methods to terminate a thread:

Sending a message to the thread to tell it to terminate itself (the right way).
Simply making a kernel-level call and killing the thread (the wrong way).

Although the wrong way might be needed in some cases, it is not safe because it simply pulls the carpet right from under the thread. If the thread needs to perform any clean-up, it never will. This can create memory and resource leaks, so be careful. Figure 11.20 illustrates the different methods to instruct a thread to terminate.

Figure 11.20. Thread termination methods.

graphics/11fig20.gif

Before you see an example of sending messages to the threads to notify them that they should terminate, take a look at the TerminateThread() function call so you know how to use it if the need arises:

BOOL TerminateThread(HANDLE  hThread,    // handle to the thread
              DWORD  dwExitCode );    // exit code for the thread

hThread identifies the thread to terminate. The handle must have THREAD_TERMINATE access.

dwExitCode specifies the exit code for the thread. Use the GetExitCodeThread() function to retrieve a thread's exit value.

If the function succeeds, the return value is TRUE. If the function fails, the return value is FALSE. To get extended error information, call GetLastError().

TerminateThread() is used to cause a thread to exit. When this occurs, the target thread has no chance to execute any user-mode code and its initial stack is not deallocated. DLLs attached to the thread are not notified that the thread is terminating, and that's a bad thing. :)

To use TerminateThread(), simply call it with the handle to the thread you want to terminate, along with a return code override, and it will be history. Now, don't get me wrong; the function wouldn't exist if there wasn't a use for it. Just make sure that you know what you're doing when you use it and that you've thought of everything.

Let's move on to the message-passing method of terminating a thread. It works by setting a global variable that the secondary threads watch. Then, when the secondary threads see that the global termination flag has been set, they all terminate. But how does the primary thread know when all the secondary threads have terminated? Well, one way to accomplish the task is to have another global variable that the threads decrement when they terminate, a reference counter of sorts.

This counter can be tested by the primary thread, and when it's equal to 0, all the secondary threads have terminated and the primary thread can be confident that it's okay to proceed with work and close the handles to the threads. This is almost true… We'll get to the "almost" part after you see a full example of this new message passing system. DEMO11_7.CPP|EXE illustrates global message passing and is shown here:

// DEMO11_7.CPP - An example of global message passing to control
// termination of threads.

// INCLUDES ///////////////////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers
                             // are included correctly

#include <windows.h>         // include the standard windows stuff
#include <windowsx.h>        // include the 32 bit stuff
#include <conio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>
#include <math.h>
#include <io.h>
#include <fcntl.h>

// DEFINES //////////////////////////////////////////////////////

#define MAX_THREADS 3

// PROTOTYPES //////////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS /////////////////////////////////////////////////////

volatile int terminate_threads = 0;  // global message flag to terminate
volatile int active_threads    = 0;  // number of active threads
                                     // (volatile so polling loops re-read them)

// FUNCTIONS ////////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out data until it is told to terminate

for(;;)
    {
    printf("%d ",(int)data+1); // output a single character
    Sleep(100);                // sleep a little to slow things down

                              // test for termination message
    if (terminate_threads)
        break;

    }  // end for index

// decrement number of active threads
if (active_threads > 0)
   active_threads--;

// just return the data sent to the thread function
return((DWORD)data);

} // end Printer_Thread

// MAIN //////////////////////////////////////////////////////

int main(void)
{

HANDLE thread_handle[MAX_THREADS];  // this holds the
                                    // handles to the threads
DWORD  thread_id[MAX_THREADS];      // this holds the ids of the threads

// start with a blank line
printf("\nStarting Threads...\n");

// create the thread, IRL we would check for errors
for (int index=0; index < MAX_THREADS; index++)
    {
    thread_handle[index] = CreateThread(NULL, // default security
               0,            // default stack
            Printer_Thread,    // use this thread function
             (LPVOID)index,     // user data sent to thread
            0,            // creation flags, 0=start now.
            &thread_id[index]);// send id back in this var

    // increment number of active threads
    active_threads++;

    }  // end for index

// now enter into printing loop, make sure this
// takes longer than threads,
// so threads finish first, note that primary thread prints 4

for (int index=0; index<25; index++)
    {
    printf("4 ");
    Sleep(100);
    }  // end for index

// at this point all the threads are still running,
// now if the keyboard is hit
// then a message will be sent to terminate all the
// threads and this thread
// will wait for all of the threads to message in

while(!kbhit());

// get that char
getch();

// set global termination flag
terminate_threads = 1;

// wait for all threads to terminate,
// when all are terminated active_threads==0
while(active_threads);

// at this point the threads should all be dead, so close handles
for (int index=0; index < MAX_THREADS; index++)
    CloseHandle(thread_handle[index]);
// end with a blank line
printf("\nAll threads terminated.\n");

}  // end main

Sample output:

Starting Threads...
4 1 2 3 4 2 1 3 4 3 1 2 4 2 1 3 4 3 1 2 4 2 1 3 4 2
 3 1 4 2 1 3 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1
 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2
 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 2 3 1 3 2
 1 1 2 3 3 2 1 1 2 3 3 2 1 1 2 3 3 2 1 1 2 3 3 2 1 1 2
 3 3 2 1 2 3 1 3 2 1 2 3 1 3 2 1 2 3 1 3 2 1 2 3 1 3 2
 1 3 1 2 3 2 1 3 1 2 3 2 1
All threads terminated.

As you can see from the sample output, when the user hits a key, all threads are terminated and the primary thread then terminates. There are two problems with this method. The first problem is subtle. Here's the scenario; read it a couple of times to make sure you see the problem:

Assume that all but one of the secondary threads have terminated.
Assume that the last thread has processor control, and it decrements the global variable that tracks the number of active threads.
At the instant this happens, there is a context switch to the primary thread. It tests the global variable and thinks that all the threads have terminated, but the last thread still hasn't returned!

In most cases this is not a problem, but it can be if there's anything between the decrement code and the return code. What you need is a function that can query if a thread is terminated. This would help in many cases. There is a function group that waits for signals, referred to as the Wait*() group, that can help.
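As a rough illustration of what "waiting until the thread has really returned" buys you, here is a portable C++11 sketch. It is not the Win32 approach this chapter builds toward: std::thread::join() stands in for a wait on the thread, std::atomic replaces the plain global flag, and worker_loop and run_and_stop_workers are hypothetical names. The termination flag works as in DEMO11_7, but there is no counter to race on, because join() cannot return until the thread function itself has returned.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

std::atomic<bool> terminate_threads(false);  // global termination flag
std::atomic<int>  iterations(0);             // total loop passes, all workers

void worker_loop()
{
    for (;;)
    {
        iterations++;                        // .. do work
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

        if (terminate_threads)               // test for termination message
            break;
    }
    // any clean-up placed here still finishes before join() returns
}

// Starts `count` workers, tells them to stop, and waits for each to return.
int run_and_stop_workers(int count)
{
    assert(count > 0);
    terminate_threads = false;

    std::vector<std::thread> workers;
    for (int index = 0; index < count; index++)
        workers.emplace_back(worker_loop);

    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    terminate_threads = true;                // set global termination flag

    int joined = 0;
    for (std::thread& worker : workers)
    {
        worker.join();                       // returns only after the thread does
        joined++;
    }
    return joined;
}
```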

The second problem is that you've created what is called a busy loop, or a polling loop. This is normally fine in Win16/DOS, but in Win32 it's a bad thing. Sitting in a tight loop, waiting on a variable, puts a lot of strain on the multitasking kernel and makes the CPU usage shoot way up.

To see this, you can use SYSMON.EXE (part of the Windows 95/98/ME/XP accessories usually), PERFMON.EXE (part of Windows NT/2000), or a similar third-party CPU usage utility. These utilities help you see what is happening with the threads and processor usage. Anyway, let's look at how the Wait*() class of functions can help you determine if a thread has terminated.

Waiting for the Right Moment

Get ready for the most confusing explanation you've ever heard… but it's not my fault, really! Whenever a thread terminates, its thread object becomes signaled in the kernel, and while the thread is running, the object is nonsignaled. Whatever that means. And what is the price of plastic zippers tomorrow? You don't care! But what you do care about is how to test for the signaling.

You can test for this event using the Wait*() class of functions, which allow you to test for a single signal (tongue twister) or multiple signals (does that sound sexual to you?). In addition, you can call one of the Wait*() functions to wait for the signal(s) until it happens, but without a busy loop. Much better than polling a global, in most cases. Figure 11.21 illustrates the mechanics of the Wait*() functions and their relationship to the running application and the OS kernel.

Figure 11.21. A timeline of signaling using `Wait*()`.

graphics/11fig21.gif

The two functions that you're going to use are called WaitForSingleObject() and WaitForMultipleObjects(), which are used to wait for a single signal or multiple signals, respectively. Their definitions are

DWORD WaitForSingleObject(
  HANDLE  hHandle,          // handle of object to wait for
  DWORD   dwMilliseconds ); // time-out interval in milliseconds

hHandle identifies the object.

dwMilliseconds specifies the time-out interval, in milliseconds. The function returns if the interval elapses, even if the object's state is nonsignaled. If dwMilliseconds is zero, the function tests the object's state and returns immediately. If dwMilliseconds is INFINITE, the function's time-out interval never elapses.

If the function succeeds, the return value indicates the event that caused the function to return. If the function fails, the return value is WAIT_FAILED. To get extended error information, call GetLastError().

The return value on success is one of the following values:

WAIT_ABANDONED— The specified object is a mutex object that was not released by the thread that owned it before the thread terminated. Ownership of the mutex object is granted to the calling thread, and the mutex is set to nonsignaled.
WAIT_OBJECT_0— The state of the specified object is signaled.
WAIT_TIMEOUT— The time-out interval has elapsed, and the object's state is nonsignaled.
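The signaled/timed-out distinction can be sketched in portable C++11 as well. To be clear, this is an analogy and not the Win32 API: std::future::wait_for() stands in for the wait call, and the enum WaitResult and the functions wait_for_done and demo_poll_then_wait are invented for this example. A zero time-out polls and returns immediately, and a longer time-out sleeps until the worker signals that it is done.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

enum WaitResult { kSignaled, kTimedOut };   // hypothetical result codes

// Waits up to timeout_ms for `done` to become ready, without a busy loop.
WaitResult wait_for_done(std::future<void>& done, int timeout_ms)
{
    assert(timeout_ms >= 0);
    std::future_status status =
        done.wait_for(std::chrono::milliseconds(timeout_ms));
    return status == std::future_status::ready ? kSignaled : kTimedOut;
}

// Polls a worker that cannot have finished yet, then waits for it properly.
bool demo_poll_then_wait()
{
    std::promise<void> go, finished;
    std::future<void> go_signal = go.get_future();
    std::future<void> done      = finished.get_future();

    std::thread worker([&go_signal, &finished] {
        go_signal.wait();        // held until the primary thread releases it
        finished.set_value();    // the "signaled" moment
    });

    WaitResult before = wait_for_done(done, 0);     // zero time-out: just test
    go.set_value();                                 // let the worker finish
    WaitResult after  = wait_for_done(done, 5000);  // sleep until done

    worker.join();
    return before == kTimedOut && after == kSignaled;
}
```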

Basically, the WaitForSingleObject() function checks the current state of the specified object. If the object's state is nonsignaled, the calling thread enters an efficient wait state. The thread consumes very little processor time while waiting for one of the conditions of the wait to be satisfied. And here is the function used to wait for multiple signals, or in this case multiple threads, to terminate:

DWORD WaitForMultipleObjects(DWORD  nCount, // number of handles
                                            // in handle array
         CONST HANDLE *lpHandles, // address of object-handle array
         BOOL  bWaitAll,      // wait flag
         DWORD  dwMilliseconds );  // time-out interval in milliseconds

nCount specifies the number of object handles in the array pointed to by lpHandles. The maximum number of object handles is MAXIMUM_WAIT_OBJECTS.

lpHandles points to an array of object handles. The array can contain handles of objects of different types. Note for Windows NT: The handles must have SYNCHRONIZE access.

bWaitAll specifies the wait type. If TRUE, the function returns when all objects in the lpHandles array are signaled at the same time. If FALSE, the function returns when any one of the objects is signaled. In the latter case, the return value indicates the object whose state caused the function to return.

dwMilliseconds specifies the time-out interval, in milliseconds. The function returns if the interval elapses, even if the conditions specified by the bWaitAll parameter are not satisfied. If dwMilliseconds is zero, the function tests the states of the specified objects and returns immediately. If dwMilliseconds is INFINITE, the function's time-out interval never elapses.

If the function fails, the return value is WAIT_FAILED. The return value on success is one of the following values:

WAIT_OBJECT_0 to (WAIT_OBJECT_0 + nCount - 1)— If bWaitAll is TRUE, the return value indicates that the state of all specified objects is signaled. If bWaitAll is FALSE, the return value minus WAIT_OBJECT_0 indicates the lpHandles array index of the object that satisfied the wait. If more than one object became signaled during the call, this is the array index of the signaled object with the smallest index value of all the signaled objects.
WAIT_ABANDONED_0 to (WAIT_ABANDONED_0 + nCount - 1)— If bWaitAll is TRUE, the return value indicates that the state of all specified objects is signaled and at least one of the objects is an abandoned mutex object. If bWaitAll is FALSE, the return value minus WAIT_ABANDONED_0 indicates the lpHandles array index of an abandoned mutex object that satisfied the wait.
WAIT_TIMEOUT— The time-out interval elapsed and the conditions specified by the bWaitAll parameter are not satisfied.

WaitForMultipleObjects() determines whether the conditions exist that satisfy the wait. If the wait is not satisfied, the calling thread enters an efficient wait state, consuming very little processor time, while waiting for one of the conditions of the wait to be satisfied.

Using Signaling to Synchronize Threads

These explanations are very technical. So, as an example of how to use these functions, you're going to make another slight change to the program you've been working with. For the next version, you're going to remove the global termination signal flag and create a main loop that simply calls WaitForSingleObject().

The only reason that you're removing the global termination flag is to make the program simpler. A global flag is still the best way to tell threads to terminate; it's just that sitting in a busy loop is not the best way to test whether they've actually terminated.

And that is why you're going to use the WaitForSingleObject() call. This call sits in a virtual wait loop that eats very little processor time. Also, because WaitForSingleObject() can only wait for one signal, and thus one thread, to terminate, this example will only have one secondary thread.

In a moment, you'll rewrite the program to contain three threads, and you'll use WaitForMultipleObjects() to wait for all of them to terminate. Anyway, DEMO11_8.CPP|EXE uses WaitForSingleObject() and creates one extra thread. Take a look at the code:

// DEMO11_8.CPP - A single threaded example of
// WaitForSingleObject(...).

// INCLUDES //////////////////////////////////////////////////////////
#define WIN32_LEAN_AND_MEAN  // make sure certain
                             // headers are included correctly

#include <windows.h>         // include the standard windows stuff
#include <windowsx.h>        // include the 32 bit stuff
#include <conio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>
#include <math.h>
#include <io.h>
#include <fcntl.h>

// DEFINES //////////////////////////////////////////////////////////

// PROTOTYPES //////////////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS /////////////////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{ // this thread function simply prints out data 50
// times with a slight delay
for (int index=0; index<50; index++)
    {
    printf("%d ",(int)(INT_PTR)data); // output the thread's data value
    Sleep(100);         // sleep a little to slow things down
    }  // end for index

// just return the data sent to the thread function
return((DWORD)data);

}  // end Printer_Thread

// MAIN ///////////////////////////////////////////////////////

int main(void)
{
HANDLE thread_handle;  // this is the handle to the thread
DWORD  thread_id;      // this is the id of the thread

// start with a blank line
printf("\nStarting threads...\n");

// create the thread, IRL we would check for errors
thread_handle = CreateThread(NULL, // default security
              0,               // default stack
              Printer_Thread,  // use this thread function
              (LPVOID)1,         // user data sent to thread
              0,             // creation flags, 0=start now.
              &thread_id);         // send id back in this var
// now enter into printing loop, make sure
// this is shorter than the thread,
// so thread finishes last
for (int index=0; index<25; index++)
    {
    printf("2 ");
    Sleep(100);
    }  // end for index

// note that this print statement may get
// interspliced with the output of the
// thread, very key!

printf("\nWaiting for thread to terminate\n");

// at this point the secondary thread should still be working,
// now we will wait for it
WaitForSingleObject(thread_handle, INFINITE);

// at this point the thread should be dead
CloseHandle(thread_handle);

// end with a blank line
printf("\nAll threads terminated.\n");

}  // end main

Sample output:

Starting threads...
2 1 2 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1
1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1
Waiting for thread to terminate
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
All threads terminated.

The program is very simple. As usual, you create the secondary thread and then, right away, you enter into the printing loop. When the loop finishes, WaitForSingleObject() is called. If you had more work to do in the primary thread, you would do it. But in this case you don't, so you just enter into the wait function and wait. If you run the program with SYSMON.EXE active, you'll see that there is almost no processor usage when the wait function is entered, whereas there would be if you used a busy loop.

Before moving on to the next example and multiple threads, there is a little trick you can do with WaitForSingleObject(). Let's say that you want to know the status of a thread at this moment, but you don't want to wait for it to terminate. This can be done by calling WaitForSingleObject() with a time-out of zero, as shown here:

//...code

DWORD state = WaitForSingleObject(thread_handle, 0);  // get the status
// test the status
if (state==WAIT_OBJECT_0)
   {
   // thread is signaled, i.e. terminated
   }
else
if (state==WAIT_TIMEOUT)
   {
   // thread is still running
   }

//...code

Simple enough. This is a great way to test if a particular thread has terminated. This, coupled with the global termination message, is a very robust method to terminate a thread and check if it was actually terminated in a real-time loop when you don't want to wait for the termination until it happens.

Waiting for Multiple Objects

You're almost done. The last Wait*() class function waits on multiple objects or threads to signal. Let's make a program that uses this function. All you need to do is create an array of threads and then pass the array of handles to WaitForMultipleObjects(), along with a couple of parameters.

When the function returns, if all went well, all the threads will have terminated. DEMO11_9.CPP|EXE is similar to DEMO11_8.CPP|EXE, except that it creates multiple threads and then the primary thread waits for all of them to terminate. Again, you don't use a global termination flag because you already know how to. Each secondary thread simply runs a few cycles and then terminates. The source for DEMO11_9.CPP is listed here for your review:

// DEMO11_9.CPP - An example use of
// WaitForMultipleObjects(...)

// INCLUDES ///////////////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers
// are included correctly

#include <windows.h>         // include the standard windows stuff
#include <windowsx.h>        // include the 32 bit stuff
#include <conio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>
#include <math.h>
#include <io.h>
#include <fcntl.h>

// DEFINES ///////////////////////////////////////////////////

#define MAX_THREADS 3

// PROTOTYPES /////////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);
// GLOBALS ////////////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out data 50
// times with a slight delay
for (int index=0; index<50; index++)
    {
    printf("%d ",(int)(INT_PTR)data+1); // output the thread number
    Sleep(100);               // sleep a little to slow things down
    }  // end for index

// just return the data sent to the thread function
return((DWORD)data);

}  // end Printer_Thread

// MAIN ////////////////////////////////////////////////////////

int main(void)
{
HANDLE thread_handle[MAX_THREADS]; // this holds the
                                   // handles to the threads
DWORD  thread_id[MAX_THREADS];      // this holds the ids of the threads

// start with a blank line
printf("\nStarting all threads...\n");

// create the threads, IRL we would check for errors
int index; // declared here so that all the loops below can share it
for (index=0; index<MAX_THREADS; index++)
    {
    thread_handle[index] = CreateThread(NULL, // default security
                            0,        // default stack
                Printer_Thread,// use this thread function
                (LPVOID)index, // user data sent to thread
                0,        // creation flags, 0=start now.
                &thread_id[index]);    // send id back in this var
    }  // end for index

// now enter into printing loop,
// make sure this takes less time than the threads
// so it finishes first
for (index=0; index<25; index++)
    {
    printf("4 ");
    Sleep(100);
    }  // end for index

// now wait for all the threads to signal termination
WaitForMultipleObjects(MAX_THREADS, // number of threads to wait for
                   thread_handle,  // handles to threads
                   TRUE,           // wait for all?
                   INFINITE);      // time to wait,INFINITE = forever

// at this point the threads should all be dead, so close handles
for (index=0; index<MAX_THREADS; index++)
    CloseHandle(thread_handle[index]);

// end with a blank line
printf("\nAll threads terminated.\n");

}  // end main

Sample output:

Starting all threads...
4 1 2 3 4 1 2 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3
1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3
1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3
1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3
1 4 2 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1
3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1
3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3
All threads terminated.

The sample output is what you'd expect. All threads print along with the primary thread (which prints 4's for a bit), but when the loop in the primary thread is complete, the secondary threads continue until they all finish. Only then does the primary thread terminate, because WaitForMultipleObjects() blocks it until every secondary thread has signaled.

Multithreading and DirectX

Now you know something about multithreading. The next question is how you can really use it in game programming and DirectX programs. Just do it—that's all there is to it. Of course, you must make sure to use the multithreaded libraries rather than the single-threaded libraries when compiling. In addition, there are a lot of "critical section" problems that might arise when you're mucking with DirectX resources.

Make sure that you use a global strategy for resources so that if more than one thread accesses a resource, nothing will blow up. For example, let's say that one thread locks a surface, and then another thread executes and tries to lock the same surface. This will cause a problem. These kinds of problems can be solved using semaphores, mutexes, and critical sections. I don't have time to cover any of these, but you can always pick up a good book on the subject, like Multithreading Applications in Win32 by Jim Beveridge and Robert Wiener, published by Addison-Wesley. This is the best book I've seen on the topic.

To implement this kind of resource management and to share resources between threads properly, you simply create variables that track whether another thread is using the resource. Then, any thread that wants a resource that another thread might be using tests this variable before mucking with it. Of course, this can also be a problem unless the variable can be tested and changed atomically, because you could be halfway through changing a variable when another thread gains control.

You can reduce the risk by declaring these variables volatile, which tells the compiler to reread the variable from memory on every access rather than keeping a copy in a register. That alone doesn't make the test-and-change atomic, though. In the end you'll have to use semaphores (a simple counter like the global variable, but implemented with atomic code that can't be interrupted), mutexes (which allow only one thread at a time to enter a critical section; a binary semaphore), critical sections (sections of code that you bracket with Win32 calls so that only one thread at a time can execute them), and so forth, so read up on them. On the other hand, if each thread is fairly independent in what it does, you won't have to worry about this stuff as much.

For an example of a DirectX application that uses threads, check out DEMO11_10.CPP|EXE (16-bit version, DEMO11_10_16B.CPP|EXE). It creates a number of alien BOBs (blitter objects) and moves them around in the mainline. Multithreading is used to animate the colors of the BOBs for the 8-bit version and move the BOBs in the 16-bit version. This is a very simple and safe example of multithreading. Make sure to link with all the DirectX .LIB files.

However, if you had many threads all calling the same functions, the problem of reentrancy would come into play. Reentrant functions need to keep their state in local variables or in data supplied by the caller; they can't use globals that can be corrupted by preemptive threads coming in and out of the code.
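The difference is easier to see side by side. In this small sketch, both function names and the "SCORE:" text are invented for illustration; the first version keeps its result in a shared static buffer, while the second keeps everything in storage owned by the caller.

```cpp
#include <stdio.h>
#include <string.h>

// NOT reentrant: every caller shares the one static buffer, so a
// second call (from this thread or any other) overwrites the result
// the first caller is still holding
const char *Format_Score_Shared(int score)
{
static char buffer[32];
snprintf(buffer, sizeof(buffer), "SCORE:%d", score);
return(buffer);

} // end Format_Score_Shared

// Reentrant: all state lives in storage supplied by the caller, so
// each thread can pass its own buffer and nothing is shared
char *Format_Score(int score, char *buffer, size_t size)
{
snprintf(buffer, size, "SCORE:%d", score);
return(buffer);

} // end Format_Score
```

Two threads calling Format_Score_Shared() at the same time can each end up reading the other's text, or a half-written mix of both. With Format_Score(), each thread works on its own local array.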

In addition, if you use threads to animate the DirectX blitter objects themselves, surface contention, timing, and synchronization will really wreak havoc on your code. I suggest restricting the use of threads to processes that are for the most part independent of others, exist in their own "state space," and don't have to run at a precise rate.

Advanced Multithreading

Well, this is a good place to stop because the next set of topics has to do with race conditions, deadlocks, critical sections, mutexes, semaphores, and really big headaches. All of these things (except the last one) help you write multithreaded programs that don't step on each other. However, even without knowing about them, you can still accomplish a lot of safe multithreaded programming just by using common sense and remembering that any thread can interrupt any other thread. Just be careful with how your threads access shared data structures.

Try to do everything as atomically as possible. Make sure that one thread doesn't alter a variable and then another thread uses this half-altered version! Also, there were a few function calls left out of this chapter that are fairly basic, such as ExitThread() and GetExitCodeThread(), but they're fairly simple to understand and you can look them up in your favorite API bible.
