Working with High-Color Modes

High-color modes (modes that require more than eight bits per pixel) are of course more visually pleasing than the 256-color modes. However, they usually aren't used in software-based 3D engines for a number of reasons. The biggest reasons are as follows:
However, today's computers are sufficiently fast that you can write 16-bit and even 24-bit software engines that run fast enough (though not nearly as fast as hardware, of course). Still, 8-bit color is something to think about if you are making simpler games that target a large audience, and it is also easier for beginners to understand and program.

Working with high-color modes is conceptually similar to working with palettized modes, with the single caveat that you aren't writing color indices into the frame buffer, but instead full RGB-encoded pixel values. This means that you must know how to create an RGB pixel encoding for the high-color modes that you want to work with. Figure 7.1 depicts several 16-bit pixel encodings.

Figure 7.1. 16-Bit RGB pixel encodings.

16-Bit High-Color Mode

Referring to Figure 7.1, there are a number of possible bit encodings for 16-bit modes:

Alpha.5.5.5: This mode uses a single bit at position D15 to represent a possible Alpha component (transparency), and the remaining 15 bits are equally distributed with five bits for red, five bits for green, and five bits for blue. This makes a total of 2^5 = 32 shades for each color and a palette of 32x32x32 = 32,768 colors.

X.5.5.5: This mode is similar to the Alpha.5.5.5 mode, except the MSB (most significant bit) is unused and can be anything. The color range is still 32 shades of each primary color (red, green, and blue), with a total of 32x32x32 = 32,768 colors.

5.6.5: This is the most common mode and uses all 16 bits of the WORD to define the color. The format is, of course, five bits for red, six bits for green, and five bits for blue, for a total of 32x64x32 = 65,536 colors.

Now, you may ask, "Why six bits for green?" Well, my little leprechaun, the answer is that human eyes are more sensitive to green, and therefore the increased range for green is the most logical choice of the three primaries. Now that you know the RGB bit-encoding formats, the question is how to build them up.
You accomplish this task with simple bit shifting and masking operations, as shown in the following macros:

    // this builds a 16 bit color value in 5.5.5 format (1-bit alpha mode)
    #define _RGB16BIT555(r,g,b) (((b) & 31) + (((g) & 31) << 5) + (((r) & 31) << 10))

    // this builds a 16 bit color value in 5.6.5 format (green dominant mode)
    #define _RGB16BIT565(r,g,b) (((b) & 31) + (((g) & 63) << 5) + (((r) & 31) << 11))

You'll notice from the macros and Figure 7.2 that the red bits are located in the high-order bits of the color WORD, the green bits are in the middle bits, and the blue bits are located in the low-order bits of the color WORD. This may seem backwards because PCs are little-endian and place data in low-to-high order, but in this case the bits are in big-endian format, which is much better because they follow RGB order from MSB to LSB.

Figure 7.2. Color WORDs are big-endian.

WARNING
Before you build a quick demo of 16-bit mode, there's one more little detail that I must address: how on Earth do you detect if the video mode is 5.5.5 or 5.6.5? This is important because it's not under your control. You can tell DirectDraw to create a 16-bit mode, but the bit encoding is up to the hardware. You must know this detail because the green channel will be all jacked up if you don't take it into consideration! What you need to know is the pixel format.

Getting the Pixel Format

To figure out the pixel format of any surface, all you need to do is call the function IDirectDrawSurface7::GetPixelFormat(), shown here:

    HRESULT GetPixelFormat(LPDDPIXELFORMAT lpDDPixelFormat);

You already saw the DDPIXELFORMAT structure in the previous chapter, but the fields you're interested in are

    DWORD dwSize;        // the size of the structure, must be set by you
    DWORD dwFlags;       // flags describing the surface, refer to Table 7.1
    DWORD dwRGBBitCount; // number of bits for Red, Green, and Blue

Before you make the call, the dwSize field must be set to the size of a DDPIXELFORMAT structure.
After the call, both the dwFlags field and the dwRGBBitCount field will be valid and contain the informational flags, along with the number of RGB bits for the surface in question. Table 7.1 lists a subset of the possible flags contained in dwFlags. Note that there are a lot more flags, especially for D3D-related properties. Please refer to the DirectX SDK for more information. The flags that matter the most right now are

DDPF_PALETTEINDEXED8: This indicates that the surface is an 8-bit palettized mode.

DDPF_RGB: This indicates that the surface is an RGB mode and the format can be queried by testing the value in dwRGBBitCount.

So all you need to do is write a test that looks something like this:

    DDPIXELFORMAT ddpixel;              // used to hold info
    LPDIRECTDRAWSURFACE7 lpdds_primary; // assume this is valid

    // clear our structure
    memset(&ddpixel, 0, sizeof(ddpixel));

    // set length
    ddpixel.dwSize = sizeof(ddpixel);

    // make call off surface (assume primary this time)
    lpdds_primary->GetPixelFormat(&ddpixel);

    // now perform tests
    // check if this is an RGB mode or palettized
    if (ddpixel.dwFlags & DDPF_RGB)
    {
        // RGB mode
        // what's the RGB mode
        switch(ddpixel.dwRGBBitCount)
        {
        case 15: // must be 5.5.5 mode
            {
            // use the _RGB16BIT555(r,g,b) macro
            } break;

        case 16: // must be 5.6.5 mode
            {
            // use the _RGB16BIT565(r,g,b) macro
            // note: some drivers report 16 for 5.5.5 surfaces too; if in
            // doubt, the green mask in ddpixel.dwGBitMask tells them apart
            // (0x03E0 for 5.5.5, 0x07E0 for 5.6.5)
            } break;

        case 24: // must be 8.8.8 mode
            {
            } break;

        case 32: // must be alpha(8).8.8.8 mode
            {
            } break;

        default: break;
        } // end switch
    } // end if
    else if (ddpixel.dwFlags & DDPF_PALETTEINDEXED8)
    {
        // 256 color palettized mode
    } // end if
    else
    {
        // something else??? more tests
    } // end else

Fairly simple code, huh? A bit ugly granted, but that comes with the territory, baby! The real power of GetPixelFormat() comes into play when you don't set the video mode and you simply create a primary surface in a windowed mode. In that case, you'll have no idea about the properties of the video system and you must query the system.
Otherwise, you won't know the color depth, pixel format, or even the resolution of the system.

Now that you're a 16-bit expert, here's a demo! There's nothing to creating a 16-bit application: just make the call to SetDisplayMode() with 16 bits for the color depth, and that's it. As an example, here are the steps you would take to create a full-screen, 16-bit color mode in DirectDraw:

    LPDIRECTDRAW7 lpdd = NULL;                // used to get directdraw7
    DDSURFACEDESC2 ddsd;                      // surface description
    LPDIRECTDRAWSURFACE7 lpddsprimary = NULL; // primary surface

    // create IDirectDraw7 and test for error
    if (FAILED(DirectDrawCreateEx(NULL, (void **)&lpdd, IID_IDirectDraw7, NULL)))
        return(0);

    // set cooperation level to requested mode
    if (FAILED(lpdd->SetCooperativeLevel(main_window_handle,
               DDSCL_ALLOWMODEX | DDSCL_FULLSCREEN |
               DDSCL_EXCLUSIVE | DDSCL_ALLOWREBOOT)))
        return(0);

    // set the display mode to 16 bit color mode
    if (FAILED(lpdd->SetDisplayMode(640,480,16,0,0)))
        return(0);

    // Create the primary surface
    memset(&ddsd,0,sizeof(ddsd));
    ddsd.dwSize = sizeof(ddsd);
    ddsd.dwFlags = DDSD_CAPS;

    // set caps for primary surface
    ddsd.ddsCaps.dwCaps = DDSCAPS_PRIMARYSURFACE;

    // create the primary surface
    if (FAILED(lpdd->CreateSurface(&ddsd,&lpddsprimary,NULL)))
        return(0);

And that's all there is to it. At this point, you would see a black screen (possibly garbage if the primary buffer memory has data in it). To simplify the discussion, assume that you already tested the pixel format and found that it's RGB 16-bit 5.6.5 mode, which is what you would expect, because you set the mode! In the worst-case scenario, however, it could have been the 5.5.5 format. Anyway, to write a pixel to the screen, you must lock the surface, build the 16-bit RGB WORD, write the pixel into the surface memory, and unlock the surface.
Here's the code for a rough 16-bit plot pixel function:

    void Plot_Pixel16(int x, int y, int red, int green, int blue,
                      LPDIRECTDRAWSURFACE7 lpdds)
    {
        // this function plots a pixel in 16-bit color mode
        // very inefficient...

        DDSURFACEDESC2 ddsd; // directdraw surface description

        // first build up color WORD
        USHORT pixel = _RGB16BIT565(red,green,blue);

        // now lock video buffer, bail out if the lock fails
        DDRAW_INIT_STRUCT(ddsd);
        if (FAILED(lpdds->Lock(NULL,&ddsd,
                   DDLOCK_WAIT | DDLOCK_SURFACEMEMORYPTR,NULL)))
            return;

        // write the pixel
        // alias the surface memory pointer to a USHORT ptr
        USHORT *video_buffer = (USHORT *)ddsd.lpSurface;

        // write the data
        video_buffer[x + y*(ddsd.lPitch >> 1)] = pixel;

        // unlock the surface
        lpdds->Unlock(NULL);
    } // end Plot_Pixel16

Notice the use of DDRAW_INIT_STRUCT(ddsd), which is a simple macro that zeros out the structure and sets its dwSize field. I'm getting tired of doing it the long way. Here's the macro definition:

    // this macro should be on one line
    #define DDRAW_INIT_STRUCT(ddstruct) { memset(&ddstruct,0,sizeof(ddstruct)); ddstruct.dwSize=sizeof(ddstruct); }

For example, to plot a pixel on the primary surface at (10,30) with RGB values (255,0,0), you would do something like this:

    Plot_Pixel16(10,30,        // x,y
                 255,0,0,      // rgb
                 lpddsprimary); // surface to draw on

Although the function seems reasonably simple, it's extremely inefficient. There are a number of optimizations that you can take advantage of. The first problem is that the function locks and unlocks the sent surface each time. This is totally unacceptable. Locking/unlocking can take hundreds of microseconds on some video cards, and maybe even longer. The bottom line is that in a game loop, you should lock a surface once, do all the manipulation you're going to do with it, and unlock it when you're done, as shown in Figure 7.3. That way you don't have to keep locking/unlocking, zeroing out memory, etc. For example, the memory fill of the DDSURFACEDESC2 structure probably takes longer than the pixel plot!
Not to mention that the function isn't inline and the function overhead is probably killing you.

Figure 7.3. DirectDraw surfaces should be locked as little as possible.

These are the types of things that a game programmer needs to keep in mind. You aren't writing a word processor program here. You need speed! Here's another version of the function with a little bit of optimization, but it can still be 10 times faster:

    inline void Plot_Pixel_Fast16(int x, int y,
                                  int red, int green, int blue,
                                  USHORT *video_buffer, int lpitch)
    {
        // this function plots a pixel in 16-bit color mode
        // assuming that the caller already locked the surface
        // and is sending a pointer and byte pitch to it

        // first build up color WORD
        USHORT pixel = _RGB16BIT565(red,green,blue);

        // write the data
        video_buffer[x + y*(lpitch >> 1)] = pixel;
    } // end Plot_Pixel_Fast16

I still don't like the multiply and shift, but this new version isn't bad. You can get rid of both the multiply and shift with a couple of tricks. First, the shift is needed because lPitch is the memory width in bytes. However, because you're assuming that the caller already locked the surface and queried the memory pointer and pitch from the surface, it's a no-brainer to add one more step to the process to compute a WORD or 16-bit strided version of lpitch, like this:

    int lpitch16 = (lpitch >> 1);

Basically, lpitch16 is now the number of 16-bit WORDs that make up a video line. With this new value, you can rewrite the function once again, like this:

    inline void Plot_Pixel_Faster16(int x, int y,
                                    int red, int green, int blue,
                                    USHORT *video_buffer, int lpitch16)
    {
        // this function plots a pixel in 16-bit color mode
        // assuming that the caller already locked the surface
        // and is sending a pointer and 16-bit WORD pitch to it

        // first build up color WORD
        USHORT pixel = _RGB16BIT565(red,green,blue);

        // write the data
        video_buffer[x + y*lpitch16] = pixel;
    } // end Plot_Pixel_Faster16

That's getting there!
The function is inline and has a single multiply, addition, and memory access. Not bad, but it could be better! The final optimization is to use a huge lookup table to get rid of the multiply, but this may not be needed because integer multiplies are getting down to single cycles on newer Pentium X architectures. It is a way to speed things up, however.

On the other hand, you can get rid of the multiply by using a number of shift-adds. For example, assuming a perfectly linear memory mode (without any extra stride per line), you know that it's exactly 1,280 bytes from one video line to another in a 640x480 16-bit mode. Therefore, you need to multiply y by 640 because the array access will use automatic pointer arithmetic and scale anything in the [] array operator by a factor of 2 (2 bytes per USHORT WORD). Anyway, here's the math:

    y*640 = y*512 + y*128

512 is equal to 2^9, and 128 is equal to 2^7. Therefore, if you were to shift y to the left 9 times and then add that to y shifted to the left 7 times, the result should be equivalent to y*640, or mathematically:

    y*640 = y*512 + y*128 = (y << 9) + (y << 7)

That's it! If you aren't familiar with this trick, take a look at Figure 7.4. Basically, shifting any binary-encoded number one bit to the right is the same as dividing by 2, and shifting one bit to the left is the same as multiplying by 2. Furthermore, multiple shifts accumulate. Hence, you can use this property to perform very fast multiplication by numbers that are powers of 2. However, if the numbers aren't powers of 2, you can always break them into a sum of terms that are, as in the previous case. Now, optimizations like these aren't really important on Pentium II+ processors since they can usually multiply in a single clock, but on older processors or other platforms like the Game Boy Advance, knowing these tricks always comes in handy.

Figure 7.4.
Using binary shifting to multiply and divide.

NOTE
You'll see a lot more of these tricks when you get to Chapter 11, "Algorithms, Data Structures, Memory Management, and Multithreading."

For an example of using the 16-bit modes to write pixels to the screen, take a look at DEMO7_1.CPP|EXE on the CD. The program basically implements what you've done here and blasts random pixels to the screen. Take a look at the code and note that you don't need a palette anymore, which is kind of nice <BG>. By the way, the code is in the standard T3D Game Engine template, so the only things you need to really look at are Game_Init() and Game_Main(). The contents of Game_Main() are shown here:

    int Game_Main(void *parms = NULL, int num_parms = 0)
    {
        // this is the main loop of the game, do all your processing
        // here

        // for now test if user is hitting ESC and send WM_CLOSE
        if (KEYDOWN(VK_ESCAPE))
            SendMessage(main_window_handle,WM_CLOSE,0,0);

        // plot 1000 random pixels to the primary surface and return
        // clear ddsd and set size, never assume it's clean
        DDRAW_INIT_STRUCT(ddsd);

        // lock the primary surface
        if (FAILED(lpddsprimary->Lock(NULL, &ddsd,
                   DDLOCK_SURFACEMEMORYPTR | DDLOCK_WAIT, NULL)))
            return(0);

        // now ddsd.lPitch is valid and so is ddsd.lpSurface
        // make a couple aliases to make code cleaner, so we don't
        // have to cast
        int lpitch16 = (int)(ddsd.lPitch >> 1);
        USHORT *video_buffer = (USHORT *)ddsd.lpSurface;

        // plot 1000 random pixels with random colors on the
        // primary surface, they will be instantly visible
        for (int index=0; index < 1000; index++)
        {
            // select random position and color for 640x480x16
            int red   = rand()%256;
            int green = rand()%256;
            int blue  = rand()%256;
            int x = rand()%640;
            int y = rand()%480;

            // plot the pixel
            Plot_Pixel_Faster16(x,y,red,green,blue,video_buffer,lpitch16);
        } // end for index

        // now unlock the primary surface
        if (FAILED(lpddsprimary->Unlock(NULL)))
            return(0);

        // return success or failure or your own return code here
        return(1);
    } // end Game_Main

24/32-Bit High-Color Mode

Once you've mastered 16-bit mode, 24-bit and 32-bit modes are trivial. I'll begin with 24-bit mode because it's simpler than 32-bit mode, which is not a surprise! 24-bit mode uses exactly one byte per channel: red, green, and blue. Thus, there's no loss and a total of 256 shades per channel, giving a total possible number of colors of 256x256x256 = 16.7 million.

The bits for red, green, and blue are encoded just as they were in 16-bit mode, except that you don't have to worry about one channel using more bits than another. Because there's one byte per channel and three channels, there are three bytes per pixel. This makes for really ugly addressing, as shown in Figure 7.5. Alas, writing pixels in pure 24-bit mode is rather contrived, as shown in the following 24-bit version of the pixel-writing function:

    inline void Plot_Pixel_24(int x, int y,
                              int red, int green, int blue,
                              UCHAR *video_buffer, int lpitch)
    {
        // this function plots a pixel in 24-bit color mode
        // assuming that the caller already locked the surface
        // and is sending a pointer and byte pitch to it

        // in byte or 8-bit math the proper address is: 3*x + y*lpitch
        // this is the address of the low order byte, which is the Blue
        // channel, since the color value is in RGB order from MSB to LSB
        DWORD pixel_addr = (x+x+x) + y*lpitch;

        // write the data, first blue
        video_buffer[pixel_addr] = (UCHAR)blue;

        // now green
        video_buffer[pixel_addr+1] = (UCHAR)green;

        // finally red
        video_buffer[pixel_addr+2] = (UCHAR)red;
    } // end Plot_Pixel_24

Figure 7.5. Three-byte RGB addressing is ugly.

WARNING
Many video cards don't support 24-bit color mode. They support only 32-bit color, which is usually 8 bits of alpha transparency and then 24 bits of color. This is due to addressing constraints. So DEMO7_2.EXE may not work on your system.

The function takes as parameters the x,y, along with the RGB color, and finally the video buffer starting address and the memory pitch in bytes.
There's no point in sending the memory pitch or the video buffer in some WORD length because there isn't any data type that's three bytes long. Hence, the function basically starts addressing the video buffer at the requested pixel location and then writes the blue, green, and red bits for the pixel. Here's a macro to build an RGB 24-bit word:

    // this builds a 24 bit color value in 8.8.8 format
    #define _RGB24BIT(r,g,b) ((b) + ((g) << 8) + ((r) << 16) )

For an example of 24-bit mode, take a look at DEMO7_2.CPP|EXE on the CD. It basically mimics the functionality of DEMO7_1.CPP, but in 24-bit mode.

Moving on to 32-bit color, the pixel setup is a little different, as shown in Figure 7.6. In 32-bit mode, the pixel data is arranged in the following two formats:

Alpha(8).8.8.8: This format uses eight bits for alpha or transparency information (or sometimes other information) and then eight bits for each channel: red, green, and blue. However, where simple bitmapping is concerned, you can usually disregard the alpha information and simply write eights to it. The nice thing about this mode is that it's 32 bits per pixel, which is the fastest possible memory addressing mode for a Pentium.

X(8).8.8.8: Similar to the preceding mode, except in this mode the upper eight bits of the color WORD are "don't cares" or irrelevant. However, I still suggest setting them to zeroes to be safe. You may say, "This mode seems like a 24-bit mode, so why have it?" The answer is that many video cards can't address on three-byte boundaries, so the fourth byte is just for alignment.

Figure 7.6. 32-bit RGB pixel encodings.

Now, take a look at a macro to create a 32-bit color WORD:

    // this builds a 32 bit color value in A.8.8.8 format (8-bit alpha mode)
    #define _RGB32BIT(a,r,g,b) ((b) + ((g) << 8) + ((r) << 16) + ((a) << 24))

Then all you need to do is change your pixel-plotting function to use the new macro and take advantage of the four-byte-per-pixel data size.
Here it is:

    inline void Plot_Pixel_32(int x, int y,
                              int alpha, int red, int green, int blue,
                              UINT *video_buffer, int lpitch32)
    {
        // this function plots a pixel in 32-bit color mode
        // assuming that the caller already locked the surface
        // and is sending a pointer and DWORD aligned pitch to it

        // first build up color WORD
        UINT pixel = _RGB32BIT(alpha,red,green,blue);

        // write the data
        video_buffer[x + y*lpitch32] = pixel;
    } // end Plot_Pixel_32

This should look familiar. The only thing hidden is the fact that lpitch32 is the byte pitch divided by four, so it's a DWORD or 32-bit WORD stride. With that all in mind, check out DEMO7_3.CPP|EXE. It's the same pixel-plotting demo, but in 32-bit mode. It should work on your machine because more video cards support 32-bit mode than pure 24-bit mode.

All righty, then! I think I've belabored high-color modes enough that you can work with them and convert any 8-bit color code that you want. Remember, I can't assume that everyone has a Pentium IV 2.0GHz with a GeForce III 3D Accelerator. Sticking to 8-bit color is a good way to get your programs running; then you can move to 16-bit or higher modes.
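Coming back to the 5.5.5 versus 5.6.5 question from the 16-bit discussion, here is a minimal sketch of one way to make the choice at run time. It is not the code from the demos. It assumes you have already copied the bit count and the green channel mask out of the DDPIXELFORMAT structure (dwRGBBitCount and dwGBitMask), and it keys on the green mask because the SDK documentation lists only 4, 8, 16, 24, and 32 as bit counts, so a 5.5.5 surface may well report 16. The names PixelLayout16, ClassifyGreenMask, Pack555, Pack565, and PackRGB16 are made up for this illustration. Also note that, unlike the _RGB16BIT macros, these helpers take 0 to 255 channel values and scale them down by dropping the low bits rather than masking them.

```cpp
#include <cstdint>

// Hypothetical names for this sketch; none of them come from the chapter.
enum class PixelLayout16 { Unknown, X555, R565 };

// Classify a 16-bit surface by its green channel mask.
// 0x03E0 means five green bits (5.5.5), 0x07E0 means six green bits (5.6.5).
constexpr PixelLayout16 ClassifyGreenMask(std::uint32_t bit_count,
                                          std::uint32_t green_mask)
{
    return (bit_count != 16)      ? PixelLayout16::Unknown
         : (green_mask == 0x03E0) ? PixelLayout16::X555
         : (green_mask == 0x07E0) ? PixelLayout16::R565
                                  : PixelLayout16::Unknown;
}

// Pack 0..255 channel values into 5.5.5 by keeping the top five bits of each.
constexpr std::uint16_t Pack555(unsigned r, unsigned g, unsigned b)
{
    return static_cast<std::uint16_t>((((r & 0xFF) >> 3) << 10) |
                                      (((g & 0xFF) >> 3) << 5)  |
                                       ((b & 0xFF) >> 3));
}

// Pack 0..255 channel values into 5.6.5; green keeps its top six bits.
constexpr std::uint16_t Pack565(unsigned r, unsigned g, unsigned b)
{
    return static_cast<std::uint16_t>((((r & 0xFF) >> 3) << 11) |
                                      (((g & 0xFF) >> 2) << 5)  |
                                       ((b & 0xFF) >> 3));
}

// Pick the packer that matches the detected layout.
// Anything that is not 5.5.5 falls back to 5.6.5, the more common case.
constexpr std::uint16_t PackRGB16(PixelLayout16 layout,
                                  unsigned r, unsigned g, unsigned b)
{
    return (layout == PixelLayout16::X555) ? Pack555(r, g, b)
                                           : Pack565(r, g, b);
}
```

In a game you would classify the surface once at startup, right after GetPixelFormat(), store the result, and pass it to PackRGB16() (or pick a function pointer once) rather than re-testing the format for every pixel.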
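The shift-add trick from the 16-bit section is also easy to get subtly wrong, so here is a small sketch of it next to the plain multiply. The function names RowOffset640 and PixelIndex16 are hypothetical. The sketch assumes non-negative coordinates, and it only takes the shift-add path when the pitch really is 1,280 bytes, because a surface is free to have extra padding at the end of each line. Whether the shifts beat a multiply on your processor is something to measure rather than assume.

```cpp
// Hypothetical helpers for this sketch; they are not part of the demo code.

// y*640 written as shift-adds: 640 = 512 + 128 = 2^9 + 2^7.
// Assumes y is non-negative.
constexpr int RowOffset640(int y)
{
    return (y << 9) + (y << 7);
}

// Index (in 16-bit WORDs) of pixel (x,y) for a locked 16-bit surface.
// Uses the shift-add form only when the byte pitch is exactly 1,280,
// that is, a 640-pixel-wide line with no padding; otherwise it falls
// back to the general multiply by the WORD pitch.
constexpr int PixelIndex16(int x, int y, int pitch_bytes)
{
    return (pitch_bytes == 1280) ? x + RowOffset640(y)
                                 : x + y * (pitch_bytes >> 1);
}
```

With a locked surface you would write video_buffer[PixelIndex16(x, y, ddsd.lPitch)] = pixel; and the same index comes out whether or not the fast path is taken.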