Using the Blitter

If you've been programming in DOS, not only have you been stuck in a quasi-32-bit world (even with a DOS extender), but I'll bet you've never been able to use hardware acceleration for 2D/3D graphics without a driver from the manufacturer or a fat third-party library. Hardware acceleration has been around since way before DOOM, but game programmers could rarely use it because it was more of a Windows thing. However, with DirectX you can take total advantage of all acceleration: graphics, sound, input, networking, etc. But the coolest thing is finally being able to use the hardware blitter to move bitmaps and do fills! Let me show you how it works…

Normally, when you want to draw a bitmap or fill a video surface, you have to do it manually, pixel by pixel. For example, take a look at Figure 7.13, which depicts an 8x8, 256-color bitmap. Imagine that you want to copy this image to position (x,y) in a 640x480 video or offscreen buffer with linear pitch. Here's the code to do it:

    UCHAR *video_buffer; // points to VRAM or offscreen surface
    UCHAR bitmap[8*8];   // holds our bitmap in row major form

    // crude bitmap copy; (x,y) is the destination position
    // outer loop is for each row
    for (int index_y=0; index_y<8; index_y++)
        {
        // inner loop for each pixel of each row
        for (int index_x=0; index_x<8; index_x++)
            {
            // copy the pixel without transparency
            video_buffer[x+index_x + (y+index_y)*640] =
                bitmap[index_x + index_y*8];
            } // end for index_x
        } // end for index_y

Figure 7.13. An 8x8, 256-color bitmap.

Now take a few minutes (or seconds, if you're a cyborg) and make sure you completely understand what's going on and could write this yourself without looking. Refer back to Figure 7.13 to help visualize it. Basically, you're simply copying a rectangular bitmap of pixels from one place in memory to another. There are obviously a number of problems with this function, and a number of possible optimizations. First, I'll talk about the problems:

Problem 1: The function is incredibly slow.
Problem 2: The function doesn't take transparency into consideration, meaning that if you have a game object in the bitmap that has black around it, the black will be copied too. This problem is shown in Figure 7.14. You need to add code for this.

Figure 7.14. Transparent pixels aren't copied to the destination surface during blitting.

As far as optimizations go, you can do the following:

Optimization 1: Get rid of all the multiplication and most of the addition by pre-computing starting addresses in the source and destination buffers and then incrementing pointers for each pixel.

Optimization 2: Use memory fills for nontransparent runs of pixels (advanced).

Let's start by making a real function that takes transparency into consideration (using color 0 as the transparent color), and that uses better addressing to speed things up and get rid of the multiplies. Here's one example:

    void Blit8x8(int x, int y,
                 UCHAR *video_buffer,
                 UCHAR *bitmap)
    {
    // this function blits the image sent in bitmap to the
    // destination surface pointed to by video_buffer
    // the function assumes a 640x480x8 mode with linear pitch

    // compute starting point into video buffer
    // video_buffer = video_buffer + (x + y*640)
    // note that y*640 = y*512 + y*128
    video_buffer+= (x + (y << 9) + (y << 7));

    UCHAR pixel; // used to read/write pixels

    // main loop
    for (int index_y=0; index_y < 8; index_y++)
        {
        // inner loop, this is where it counts!
        for (int index_x=0; index_x < 8; index_x++)
            {
            // copy pixel, test for transparent though
            // (the assignment inside the test is intentional)
            if ((pixel = bitmap[index_x]) != 0)
                video_buffer[index_x] = pixel;
            } // end for index_x

        // advance pointers
        bitmap+=8;         // next line in bitmap
        video_buffer+=640; // next line in video_buffer
        } // end for index_y
    } // end Blit8x8

This version of the blitter function is many times faster than the previous one with multiplication, and this one even works with bitmaps that have transparent pixels. Wow! The point of this exercise is to show you how something so simple can take up so many processor cycles. If you count cycles, the function is still crap.
There's the overhead of the loop mechanics, of course, but the guts of the function are still ugly. A test for transparency must be made, two array accesses, a write to memory… yuck, yuck, yuck! This is why there are accelerators. A hardware blitter can do this in its sleep, which is why you need to use the hardware to blit images down. That way you can save processor cycles for other things, like AI and physics! Not to mention that the blitter function just shown is really stupid. It is hard-coded to 640x480x256, doesn't do any clipping (more logic), and only works for 8-bit images.

Now that I've shown you the old way to draw bitmaps, here's the first look at the blitter and how to use it to do memory fills. Then you'll see how to copy images from one surface to another. Later in the chapter, you'll use the blitter to draw game objects, but take your time.

Using the Blitter for Memory Filling

Although accessing the blitter under DirectDraw is trivial compared to programming it manually, it's still a reasonably complex piece of hardware. Therefore, whenever I get my hands on a new piece of video hardware, I always like to try something simple first before I try pushing the envelope. So let me show you how to do something that's very useful: memory fills.

Memory filling simply means filling a region of VRAM with some value. You've done this a number of times by locking a surface and then using memset() or memcpy() to manipulate and fill the surface memory, but there are a number of problems with this approach. First, you're using the main CPU to do the memory fill, so the main bus is part of the transfer. Second, the VRAM that makes up a surface may not be totally linear. In that case, you'll have to do a line-by-line fill or move. However, with the hardware blitter you can directly fill or move chunks of VRAM or DirectDraw surfaces at hardware speed!

The two functions that DirectDraw supports for blitting are IDIRECTDRAWSURFACE7::Blt() and IDIRECTDRAWSURFACE7::BltFast().
Their prototypes are shown here:

    HRESULT Blt(LPRECT lpDestRect,                   // dest RECT
                LPDIRECTDRAWSURFACE7 lpDDSrcSurface, // source surface
                LPRECT lpSrcRect,                    // source RECT
                DWORD dwFlags,                       // control flags
                LPDDBLTFX lpDDBltFx);                // special fx (very cool!)

The parameters are defined here and illustrated graphically in Figure 7.15:

Figure 7.15. Blitting from source to destination.

lpDestRect is the address of a RECT structure that defines the upper-left and lower-right points of the rectangle to blit to on the destination surface. If this parameter is NULL, the entire destination surface will be used.

lpDDSrcSurface is the address of an IDIRECTDRAWSURFACE7 interface for the DirectDraw surface to be used as the source of the blit.

lpSrcRect is the address of a RECT structure that defines the upper-left and lower-right points of the rectangle to blit from on the source surface. If this parameter is NULL, the entire source surface will be used.

dwFlags determines the valid members of the next parameter, which is a DDBLTFX structure. Within DDBLTFX, special behaviors such as scaling, rotation, and so on can be controlled, as well as color key information. The valid flags for dwFlags are shown in Table 7.3.

lpDDBltFx is a pointer to a structure containing special blitter-related information about the blit you're requesting. The data structure follows:

    typedef struct _DDBLTFX
    {
    DWORD dwSize;               // the size of this structure in bytes
    DWORD dwDDFX;               // type of blitter fx
    DWORD dwROP;                // Win32 raster ops that are supported
    DWORD dwDDROP;              // DirectDraw raster ops that are supported
    DWORD dwRotationAngle;      // angle for rotations
    DWORD dwZBufferOpCode;      // z-buffer fields (advanced)
    DWORD dwZBufferLow;         // advanced..
    DWORD dwZBufferHigh;        // advanced..
    DWORD dwZBufferBaseDest;    // advanced..
    DWORD dwZDestConstBitDepth; // advanced..
    union
        {
        DWORD dwZDestConst;                   // advanced..
        LPDIRECTDRAWSURFACE lpDDSZBufferDest; // advanced..
        };
    DWORD dwZSrcConstBitDepth;  // advanced..
    union
        {
        DWORD dwZSrcConst;                   // advanced..
        LPDIRECTDRAWSURFACE lpDDSZBufferSrc; // advanced..
        };
    DWORD dwAlphaEdgeBlendBitDepth; // alpha stuff (advanced)
    DWORD dwAlphaEdgeBlend;         // advanced..
    DWORD dwReserved;               // advanced..
    DWORD dwAlphaDestConstBitDepth; // advanced..
    union
        {
        DWORD dwAlphaDestConst;             // advanced..
        LPDIRECTDRAWSURFACE lpDDSAlphaDest; // advanced..
        };
    DWORD dwAlphaSrcConstBitDepth; // advanced..
    union
        {
        DWORD dwAlphaSrcConst;             // advanced..
        LPDIRECTDRAWSURFACE lpDDSAlphaSrc; // advanced..
        };
    union // these are very important
        {
        DWORD dwFillColor; // color word used for fill
        DWORD dwFillDepth; // z filling (advanced)
        DWORD dwFillPixel; // color fill word for RGB(alpha) fills
        LPDIRECTDRAWSURFACE lpDDSPattern;
        };
    // these are very important
    DDCOLORKEY ddckDestColorkey; // destination color key
    DDCOLORKEY ddckSrcColorkey;  // source color key
    } DDBLTFX,FAR* LPDDBLTFX;

(Note that I've boldfaced useful fields.)

If you're losing your mind, that's fantastic; it shows that you're following me <BG>. Now, take a look at BltFast():

    HRESULT BltFast(
        DWORD dwX,                           // x-position of blit on destination
        DWORD dwY,                           // y-position of blit on destination
        LPDIRECTDRAWSURFACE7 lpDDSrcSurface, // source surface
        LPRECT lpSrcRect,                    // source RECT to blit from
        DWORD dwTrans);                      // type of transfer

dwX and dwY are the (x,y) coordinates to blit to on the destination surface.

lpDDSrcSurface is the address of the IDIRECTDRAWSURFACE7 interface for the DirectDraw surface to be used as the source of the blit.

lpSrcRect is the address of the source RECT that defines the upper-left and lower-right points of the rectangle to blit from on the source surface.

dwTrans is the type of blitter operation. Table 7.4 shows the possible values. (Note that I've boldfaced the most useful flags.)

All right, the first question is, "Why are there two different blitter functions?"
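The answer should be apparent from the functions themselves: Blt() is the full-blown kitchen sink model, while BltFast() is simpler but has fewer options. Furthermore, Blt() uses DirectDraw clippers while BltFast() doesn't. This means that BltFast() is faster than Blt() in the HEL by about 10%, and may even be faster in hardware (if the hardware is crappy and sucks at clipping). The point is, use Blt() if you need clipping, and use BltFast() if you don't.

Let me show you how to use the Blt() function to fill a surface. This will be reasonably simple because there isn't a source surface (only a destination surface). A lot of the parameters, therefore, can be NULL. To do a memory fill, you must perform the following steps:

1. Initialize a DDBLTFX structure and place the color index or RGB-encoded color you want to fill with in its dwFillColor field.
2. Set up a RECT structure that describes the area you want to fill on the destination surface.
3. Call Blt() on the destination surface with NULL for the source surface and source RECT, and with the control flags DDBLT_COLORFILL | DDBLT_WAIT.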
Here's the code to fill a region of an 8-bit surface with a color:

    DDBLTFX ddbltfx; // the blitter fx structure
    RECT dest_rect;  // used to hold the destination RECT

    // first initialize the DDBLTFX structure
    DDRAW_INIT_STRUCT(ddbltfx);

    // now set the color word info to the color we desire
    // in this case, we are assuming an 8-bit mode, hence,
    // we'll use a color index from 0-255, but if this was a
    // 16/24/32 bit example then we would fill the WORD with
    // the RGB encoding for the pixel - remember!
    ddbltfx.dwFillColor = color_index; // or RGB for 16+ modes!

    // now set up the RECT structure to fill the region from
    // (x1,y1) to (x2,y2) on the destination surface
    dest_rect.left   = x1;
    dest_rect.top    = y1;
    dest_rect.right  = x2;
    dest_rect.bottom = y2;

    // make the blitter call
    lpddsprimary->Blt(&dest_rect, // pointer to dest RECT
                      NULL,       // pointer to source surface
                      NULL,       // pointer to source RECT
                      DDBLT_COLORFILL | DDBLT_WAIT, // do a color fill and wait if you have to
                      &ddbltfx);  // pointer to DDBLTFX holding info

NOTE
There's one little detail with any of the RECT structures that you send to most DirectDraw functions: In general, they're upper-left inclusive, but lower-right exclusive. In other words, if you send a RECT that's (0,0) to (10,10), the actual rectangle scanned will be (0,0) to (9,9) inclusive. So keep that in mind. Basically, if you want to fill the entire 640x480 screen, you would send upper-left as (0,0) and lower-right as (640,480).

The important things to notice are the setup and that both the source surface and RECT are NULL. This makes sense because you're using the blitter to fill with a color, not to copy data from one surface to another. Okay, let's move on, my little leprechaun.
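To make the RECT convention from the NOTE concrete, here's a minimal software sketch of the same fill done by the CPU, one memset() per row. It isn't DirectDraw code: the Rect structure is a hypothetical stand-in for the Win32 RECT, FillRect8() is a made-up name, and the surface is just a block of system memory with a pitch. It assumes the rectangle already lies inside the surface.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Win32 RECT structure (same field names).
struct Rect { long left, top, right, bottom; };

// Fill a rectangle on an 8-bit surface, one memset() per row.
// pitch is the number of bytes from the start of one row to the next,
// which can be larger than the visible width.
// The rectangle is upper-left inclusive and lower-right exclusive.
// No clipping is done: the caller must pass a RECT inside the surface.
// Returns the number of pixels written.
long FillRect8(std::uint8_t *surface, long pitch, const Rect &r, std::uint8_t color)
{
    assert(r.left <= r.right && r.top <= r.bottom);

    const long width  = r.right - r.left;  // right edge is NOT filled
    const long height = r.bottom - r.top;  // bottom edge is NOT filled

    for (long y = r.top; y < r.bottom; y++)
        std::memset(surface + y * pitch + r.left, color,
                    static_cast<std::size_t>(width));

    return width * height;
}
```

With a RECT of (0,0) to (10,10) this writes 100 pixels, the last one at (9,9), which matches the inclusive/exclusive rule above. The row-by-row loop is also the line-by-line fill you're forced into when surface memory isn't linear.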
The preceding example was for an 8-bit surface. The only change you need to make for a 16/24/32-bit high-color mode is to set ddbltfx.dwFillColor to the pixel value you want to fill with; that is, you build the actual RGB-encoded value of the fill color. Isn't that cool? For example, if the display happened to be in a 16-bit mode and you wanted to fill the screen with green, the following code would work:

    ddbltfx.dwFillColor = __RGB16BIT565(0,255,0);

Everything else in the preceding 8-bit example would stay the same. DirectDraw isn't that bad, huh?

To see the blitter hardware in action, I've created a little psychedelic demo for you called DEMO7_6.CPP|EXE. It puts the system into 640x480x16-bit mode and then fills different regions of the screen with random colors. You'll see about a zillion colored rectangles per second getting blitted to the screen (try turning the lights off and tripping out on it). Take a look at Game_Main(); it's almost trivial:

    int Game_Main(void *parms = NULL, int num_parms = 0)
    {
    // this is the main loop of the game, do all your processing
    // here

    DDBLTFX ddbltfx; // the blitter fx structure
    RECT dest_rect;  // used to hold the destination RECT

    // make sure this isn't executed again
    if (window_closed)
        return(0);

    // for now test if user is hitting ESC and send WM_CLOSE
    if (KEYDOWN(VK_ESCAPE))
        {
        PostMessage(main_window_handle,WM_CLOSE,0,0);
        window_closed = 1;
        } // end if

    // first initialize the DDBLTFX structure
    DDRAW_INIT_STRUCT(ddbltfx);

    // now set the color word info to the color we desire
    // this demo runs in a 16-bit mode, so we fill the WORD
    // with the RGB encoding for the pixel rather than an
    // 8-bit color index
    ddbltfx.dwFillColor = __RGB16BIT565(rand()%256,
                                        rand()%256, rand()%256);

    // get a random rectangle
    int x1 = rand()%SCREEN_WIDTH;
    int y1 = rand()%SCREEN_HEIGHT;
    int x2 = rand()%SCREEN_WIDTH;
    int y2 = rand()%SCREEN_HEIGHT;

    // now set up the RECT structure to fill the region from
    // (x1,y1) to (x2,y2) on the destination surface
    dest_rect.left   = x1;
    dest_rect.top    = y1;
    dest_rect.right  = x2;
    dest_rect.bottom = y2;

    // make the blitter call
    if (FAILED(lpddsprimary->Blt(&dest_rect, // pointer to dest RECT
                                 NULL,       // pointer to source surface
                                 NULL,       // pointer to source RECT
                                 DDBLT_COLORFILL | DDBLT_WAIT, // do a color fill and wait if you have to
                                 &ddbltfx))) // pointer to DDBLTFX holding info
        return(0);

    // return success or failure or your own return code here
    return(1);

    } // end Game_Main

Now that you know how to use the blitter to fill, let me show you how to use it to copy data from surface to surface. This is where the real power of the blitter comes into play. It's the foundation for the sprite or blitter object engine that you're going to make in a little while.

Copying Bitmaps from Surface to Surface

The whole point of the blitter is to copy rectangular bitmaps from some source memory to destination memory. This may involve copying the entire screen, or small bitmaps that represent game objects. In either case, you need to learn how to instruct the blitter to copy data from one surface to another. Actually, you already know how to do this and may not realize it. The blitter fill demo will do the job with a couple of changes.

When you're using the Blt() function, you basically send a source RECT and surface and a destination RECT and surface to perform the blit. The blitter will then copy the pixels from the source RECT to the destination RECT. The source and destination surfaces can be the same (a copy or move within one surface), but they're usually different. In general, the latter is the basis for most sprite engines.
(A sprite is a bitmap game image that moves around the screen.)

At this point you know how to create a primary surface and a secondary surface that serves as a back buffer, but you don't know how to create plain offscreen surfaces that aren't related to the primary surface. You can't blit them if you can't make them. Thus, I'm going to hold off on showing you the general blitting case of any surface to the primary surface until I've shown you how to blit from the back buffer to the primary surface. Then the transition from generic surface to primary or back buffer will be trivial.

All you need to do to blit between any two surfaces (the back buffer to the primary surface, for example) is set the RECTs up correctly and make a call to Blt() with the right parameterization. Take a look at Figure 7.15. Imagine that you want to copy the RECT defined by (x1,y1) to (x2,y2) on the source surface (the back buffer in this case) to (x3,y3) to (x4,y4) on the destination surface (the primary surface in this case). Here's the code:

    RECT source_rect, // used to hold source RECT
         dest_rect;   // used to hold the destination RECT

    // set up the RECT structure to copy the region from
    // (x1,y1) to (x2,y2) on the source surface
    source_rect.left   = x1;
    source_rect.top    = y1;
    source_rect.right  = x2;
    source_rect.bottom = y2;

    // now set up the RECT structure to fill the region from
    // (x3,y3) to (x4,y4) on the destination surface
    dest_rect.left   = x3;
    dest_rect.top    = y3;
    dest_rect.right  = x4;
    dest_rect.bottom = y4;

    // make the blitter call
    lpddsprimary->Blt(&dest_rect,   // pointer to dest RECT
                      lpddsback,    // pointer to source surface
                      &source_rect, // pointer to source RECT
                      DDBLT_WAIT,   // control flags
                      NULL);        // pointer to DDBLTFX holding info

That was easy, huh? Of course, there are still a few details I'm leaving out, such as clipping and transparency. I'll talk about clipping first. Take a look at Figure 7.16, which depicts a bitmap that's drawn to a surface with and without clipping.
Blitting without clipping is obviously a problem if the bitmap extends past the rectangle of the destination surface. Memory may be overwritten and so forth, so DirectDraw supports clipping via the IDirectDrawClipper interface. Or, if you wrote your own bitmap rasterizer, as you did in the example Blit8x8(), you could always add clipping code. That will slow things down, however.

Figure 7.16. The basic bitmap clipping problem.

The second issue pertaining to blitting is transparency. When you draw a bitmap, the image is always within a rectangular matrix of pixels. However, you don't want all those pixels copied when you blit. In many cases, you select a color, such as black, blue, green, or whatever, to serve as a transparent color that isn't copied (you saw this implemented in Blit8x8()). DirectDraw supports this too, through color keys, which I will also talk about shortly.

Before you move on to clipping, I'd like to show you a demo of blitting from the back buffer to the primary surface. Take a look at DEMO7_7.CPP|EXE on the CD. The only problem is that I haven't shown you how to load bitmaps from disk yet, so I can't really blit anything cool. Bummer! So what I did was draw a gradient of green in 16-bit color mode from top to bottom on the back buffer, and then use this as the source data. You'll see a bunch of gradient rectangles copied to the primary surface at warp speed.
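Before the demo source, here's a rough idea of what "adding clipping code" to a software blitter like Blit8x8() could look like. This is only a sketch under my own assumptions, not code from the demos: BlitClipped8() is a hypothetical name, the surface is 8-bit system memory with a given pitch, and color 0 is the transparent color, as in Blit8x8().

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Copy a src_w x src_h 8-bit bitmap to position (x,y) on a
// dst_w x dst_h 8-bit surface. Pixels with value 0 are transparent.
// Any part of the bitmap that falls off the surface is clipped away.
// dst_pitch is the number of bytes per destination row.
// Returns the number of pixels actually written.
int BlitClipped8(std::uint8_t *dst, int dst_w, int dst_h, int dst_pitch,
                 const std::uint8_t *src, int src_w, int src_h,
                 int x, int y)
{
    assert(dst != nullptr && src != nullptr);

    // visible part of the bitmap, in bitmap coordinates
    // (upper-left inclusive, lower-right exclusive)
    const int x0 = std::max(0, -x);
    const int y0 = std::max(0, -y);
    const int x1 = std::min(src_w, dst_w - x);
    const int y1 = std::min(src_h, dst_h - y);

    if (x0 >= x1 || y0 >= y1)
        return 0; // bitmap is entirely off the surface

    // pre-compute starting addresses, then just advance pointers
    const std::uint8_t *src_row = src + y0 * src_w + x0;
    std::uint8_t *dst_row = dst + (y + y0) * dst_pitch + (x + x0);
    const int run = x1 - x0; // pixels to scan on each row
    int written = 0;

    for (int row = y0; row < y1; row++)
    {
        for (int i = 0; i < run; i++)
        {
            const std::uint8_t pixel = src_row[i];
            if (pixel != 0) // skip transparent pixels
            {
                dst_row[i] = pixel;
                written++;
            }
        }
        src_row += src_w;     // next line in bitmap
        dst_row += dst_pitch; // next line in destination
    }
    return written;
}
```

The clipping happens once, up front, by shrinking the rectangle to be scanned. The inner loop stays the same as in Blit8x8(), so the extra cost is a handful of comparisons per blit rather than a test per pixel.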
Here's the source of Game_Main() from DEMO7_7 for your review:

    int Game_Main(void *parms = NULL, int num_parms = 0)
    {
    // this is the main loop of the game, do all your processing
    // here

    RECT source_rect, // used to hold the source RECT
         dest_rect;   // used to hold the destination RECT

    // make sure this isn't executed again
    if (window_closed)
        return(0);

    // for now test if user is hitting ESC and send WM_CLOSE
    if (KEYDOWN(VK_ESCAPE))
        {
        PostMessage(main_window_handle,WM_CLOSE,0,0);
        window_closed = 1;
        } // end if

    // get a random rectangle for source
    int x1 = rand()%SCREEN_WIDTH;
    int y1 = rand()%SCREEN_HEIGHT;
    int x2 = rand()%SCREEN_WIDTH;
    int y2 = rand()%SCREEN_HEIGHT;

    // get a random rectangle for destination
    int x3 = rand()%SCREEN_WIDTH;
    int y3 = rand()%SCREEN_HEIGHT;
    int x4 = rand()%SCREEN_WIDTH;
    int y4 = rand()%SCREEN_HEIGHT;

    // now set up the RECT structure to copy the region from
    // (x1,y1) to (x2,y2) on the source surface
    source_rect.left   = x1;
    source_rect.top    = y1;
    source_rect.right  = x2;
    source_rect.bottom = y2;

    // now set up the RECT structure to fill the region from
    // (x3,y3) to (x4,y4) on the destination surface
    dest_rect.left   = x3;
    dest_rect.top    = y3;
    dest_rect.right  = x4;
    dest_rect.bottom = y4;

    // make the blitter call
    if (FAILED(lpddsprimary->Blt(&dest_rect,   // pointer to dest RECT
                                 lpddsback,    // pointer to source surface
                                 &source_rect, // pointer to source RECT
                                 DDBLT_WAIT,   // control flags
                                 NULL)))       // pointer to DDBLTFX holding info
        return(0);

    // return success or failure or your own return code here
    return(1);

    } // end Game_Main

Also, in Game_Init() I used a little inline assembly to fill each line a DWORD at a time (32 bits, or two 16-bit pixels in RGB.RGB format) instead of using a slower 8-bit fill. Here's that code:

    _asm
        {
        CLD                       ; clear direction of copy to forward
        MOV EAX, color            ; color goes here
        MOV ECX, (SCREEN_WIDTH/2) ; number of DWORDS goes here
        MOV EDI, video_buffer     ; address of line to move data
        REP STOSD                 ; send the Pentium X on its way...
        } // end asm

Basically, the preceding code implements the following C/C++ loop:

    DWORD *edi = (DWORD *)video_buffer;
    for (DWORD ecx = 0; ecx < (SCREEN_WIDTH/2); ecx++)
        edi[ecx] = color;

If you don't know assembly language, don't freak out. I just like to use it now and then for little things like this. Also, it's good practice to use the inline assembler; it keeps you on your toes!

As an exercise, see if you can make the program work only on the primary surface. Simply delete the back buffer code, draw the image on the primary surface, and then run the blitter with the source and destination as the same surface. Watch what happens…
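If you'd rather not touch the inline assembler at all, the same two-pixels-per-DWORD line fill can be sketched in portable C++. Treat this as an illustration under my own assumptions rather than the demo's code: PackRGB565() and FillLine565() are hypothetical helpers, PackRGB565() takes 8-bit channels and is not the __RGB16BIT565 macro (whose definition isn't shown in this section), and the line is plain system memory.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Pack 8-bit-per-channel RGB into a 16-bit 5.6.5 pixel by keeping the
// top 5, 6, and 5 bits of red, green, and blue (hypothetical helper).
inline std::uint16_t PackRGB565(unsigned r, unsigned g, unsigned b)
{
    return static_cast<std::uint16_t>((((r & 0xFFu) >> 3) << 11) |
                                      (((g & 0xFFu) >> 2) << 5)  |
                                       ((b & 0xFFu) >> 3));
}

// Fill one line of 16-bit pixels, writing two pixels per 32-bit store,
// much like REP STOSD with ECX = width/2.
// If width is odd, the last pixel is written on its own.
void FillLine565(std::uint16_t *line, int width, std::uint16_t color)
{
    assert(line != nullptr && width >= 0);

    // the same pixel in both halves of the DWORD (RGB.RGB)
    const std::uint32_t pair =
        (static_cast<std::uint32_t>(color) << 16) | color;

    int i = 0;
    for (; i + 1 < width; i += 2)
        std::memcpy(line + i, &pair, sizeof pair); // one DWORD = two pixels

    if (i < width)
        line[i] = color; // odd trailing pixel
}
```

Because both halves of the DWORD hold the same pixel, byte order doesn't matter here. A decent compiler will typically turn the memcpy() into a single 32-bit store, so you keep most of the benefit of the assembly version without the assembler.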