Application

Given a modular DT with well-defined interfaces and a satisfactory implementation, it can be applied to weapon selection. Most of the remaining work lies in building the scaffolding to support it, providing the right inputs and outputs. Exploiting the DT is relatively straightforward, comparable to replacing the voting system with another component. Listing 27.2 shows how the senses are used to interpret the features, which the classification process then uses to predict the fitness.

Listing 27.2 A Function for Selecting the Weapon, Called at Regular Intervals When It's Necessary to Change

function select_weapon()
     # use senses to determine health, enemy distance...
     env = interpret_features()
     # find best weapon among the ones available
     max = 0
     for each (weapon,ammo) in inventory
          # compute the fitness based on concatenated features
          fitness = dt.predict(env + weapon + ammo)
          # only remember it if it's better
          if fitness > max then
               max = fitness
               best = weapon
          end if
     end for
     # weapon can be selected
     return best
end function

The induction of the DT and the computation of the learning samples are slightly more complex. This involves three main phases:

- Interpreting the features of the environment
- Monitoring fight episodes
- Computing the fitness
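To make the selection loop concrete, here is a minimal Python sketch of the same idea. The `predict` callable stands in for the induced decision tree; the feature encoding, inventory format, and `toy_predict` function are illustrative assumptions, not part of the original implementation.

```python
def select_weapon(env, inventory, predict):
    """Return the weapon with the highest predicted fitness.

    env       -- list of environment features (e.g. health, enemy distance)
    inventory -- list of (weapon, ammo) pairs
    predict   -- callable mapping a concatenated feature vector to a fitness
    """
    best, best_fitness = None, float("-inf")
    for weapon, ammo in inventory:
        # concatenate environment features with weapon-specific ones
        fitness = predict(env + [weapon, ammo])
        # only remember it if it's better
        if fitness > best_fitness:
            best_fitness, best = fitness, weapon
    return best

# Toy stand-in for the decision tree: prefer the shotgun when the enemy
# is close (env[1] is the distance in this hypothetical encoding).
def toy_predict(features):
    distance, weapon = features[1], features[2]
    return 80 if (weapon == "shotgun" and distance < 10) else 40

print(select_weapon([100, 5], [("railgun", 10), ("shotgun", 8)], toy_predict))
# -> shotgun
```

In the game, `predict` would be the incrementally trained DT rather than a hand-written rule.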
The implementation is built upon the same techniques used to monitor the effectiveness of rockets. The following sections explain these stages in greater detail.

Interpreting Environment Features

The features of the environment are collected using the senses from the interfaces discussed in Chapter 24, "Formalizing Weapon Choice," and other specifications (for instance, vision, inventory, and physics). The result is a set of predictor variables, with the representation shown in Table 27.1.
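As a rough illustration of how such predictor variables might be assembled into a flat vector, consider the sketch below. The sensor dictionary and the particular features are assumptions for illustration only, not the actual layout of Table 27.1.

```python
def interpret_features(senses):
    """Flatten raw sensor readings into an ordered predictor vector.

    `senses` is a hypothetical stand-in for the vision, inventory, and
    physics interfaces; each key below is an assumed feature name.
    """
    return [
        senses["health"],          # own health, 0..100
        senses["enemy_distance"],  # straight-line distance to the enemy
        senses["enemy_visible"],   # 1 if line of sight, else 0
        senses["ammo"],            # rounds left for the current weapon
    ]

features = interpret_features(
    {"health": 75, "enemy_distance": 12.5, "enemy_visible": 1, "ammo": 30}
)
print(features)  # [75, 12.5, 1, 30]
```

The point is simply that the predictor variables are an ordered vector that can be concatenated with weapon-specific features before being passed to the DT.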
These variables are the most important features to incorporate into the model, although we could easily add or remove some as necessary to find the ideal balance. These predictor variables are used by the weapon selection, but the response variable is needed for learning. The response is evaluated by monitoring the fight.

Monitoring Fight Episodes

The AI gathers four different types of information from the game, all relevant to the applicability of weapons. As with the animat learning target selection, an event-driven mechanism is used to identify hits (pain signals) and potential misses (explosions only):
Identifying the cause of damage can be the most difficult task, but it can be solved by checking the location of the pain event against the aiming direction. Alternatively, this information could be provided by the data structures used to store the messages.

Computing the Fitness

The principle at the base of the voting system is that the fitness of a weapon depends on the situation. This also means the criteria used to evaluate the weapons change depending on the conditions. It's somewhat difficult to go into this subject without considering high-level tactics (covered in Part VII), so we'll make a few assumptions. Looking at weapon selection alone, we want to take into account the following oversimplified criteria:
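Stepping back to the monitoring stage, here is a minimal sketch of an event-driven fight monitor, including the pain-location check described above. All the names, fields, 2D vector representation, and the 15-degree tolerance angle are assumptions made for illustration.

```python
import math

def caused_by_us(our_pos, aim_dir, pain_pos, tolerance_deg=15.0):
    """True if the pain event lies within tolerance_deg of the aim direction."""
    to_pain = (pain_pos[0] - our_pos[0], pain_pos[1] - our_pos[1])
    norm = math.hypot(*to_pain) * math.hypot(*aim_dir)
    if norm == 0.0:
        return False
    cos_a = (to_pain[0] * aim_dir[0] + to_pain[1] * aim_dir[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))) <= tolerance_deg

class FightEpisode:
    """Accumulates hit/miss statistics from game events during one fight."""

    def __init__(self, our_pos, aim_dir):
        self.our_pos, self.aim_dir = our_pos, aim_dir
        self.hits = self.misses = 0
        self.enemy_damage = 0.0

    def on_pain(self, location, damage):
        # only count the hit if it lines up with where we were shooting
        if caused_by_us(self.our_pos, self.aim_dir, location):
            self.hits += 1
            self.enemy_damage += damage

    def on_explosion(self):
        self.misses += 1   # explosion without a pain signal: a likely miss

    def accuracy(self):
        shots = self.hits + self.misses
        return self.hits / shots if shots else 0.0

episode = FightEpisode(our_pos=(0, 0), aim_dir=(1, 0))
episode.on_pain((10, 1), damage=40)   # ~6 degrees off the aim: counted
episode.on_pain((0, 10), damage=25)   # 90 degrees off: someone else's shot
episode.on_explosion()
print(episode.accuracy())  # 0.5
```

The angular check is the cheap geometric alternative; as noted, passing the attacker's identity along with the message would make the check trivial.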
Because the overall fitness will represent these different criteria in different situations, we need to make sure that they're roughly on the same scale. To do this, we'll rescale the values so that they fall into the range [0...100] as closely as possible, as summarized in Listing 27.3.

Listing 27.3 This Function Learns the Desired Fitness of Weapons Based on the Features of the Environment

function learn_weapon(weapon,episode)
     # gather the information from the senses
     env = interpret_features()
     # compute the fitness in terms of the monitored information
     if episode.self_health < 25 then
          fitness = -episode.self_damage
     else if episode.enemy_health < 40 then
          fitness = episode.accuracy
     else if episode.enemy_position.y > 0 then
          # enemy is facing away
          fitness = episode.max_potential
     else
          fitness = episode.enemy_damage_per_second
     end if
     # incrementally induce the fitness from concatenated features
     dt.increment(env + weapon, fitness)
end function
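The rescaling step can be made explicit with a small helper. The sketch below mirrors the branch structure of Listing 27.3; the episode fields, the scaling bounds (0..200 damage, 0..120 damage per second), and the dictionary representation are all assumptions for illustration.

```python
def rescale(value, lo, hi):
    """Map value from [lo, hi] onto [0, 100], clamping at the ends."""
    if hi == lo:
        return 0.0
    return max(0.0, min(100.0, 100.0 * (value - lo) / (hi - lo)))

def compute_fitness(episode):
    """Pick the criterion matching the situation, rescaled to [0, 100]."""
    if episode["self_health"] < 25:        # survival: penalize damage taken
        return -rescale(episode["self_damage"], 0, 200)
    if episode["enemy_health"] < 40:       # finish off: reward accuracy
        return rescale(episode["accuracy"], 0.0, 1.0)
    if episode["enemy_facing_away"]:       # free shot: raw weapon potential
        return rescale(episode["max_potential"], 0, 120)
    return rescale(episode["enemy_dps"], 0, 120)

episode = {"self_health": 80, "enemy_health": 30, "accuracy": 0.5,
           "self_damage": 10, "enemy_facing_away": False,
           "max_potential": 90, "enemy_dps": 45}
print(compute_fitness(episode))  # 50.0 (accuracy branch)
```

Only the survival branch produces negative fitness, so weapons that get the animat hurt while it is already weak are actively penalized rather than merely scored low.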
Biologically Plausible Errors

By analyzing the data statistically, it's possible to see how the problem is shaping up. Using a histogram of the potential damage per second, the main feature (distance to the enemy) shows visible trends. For example, the super shotgun is extremely efficient up close, but tails off as the enemy gets farther away; this is understandably caused by the spread of the fire. Other trends are quite surprising. The railgun performs well at a distance, as expected, but up close its performance is higher than expected. Moreover, traveling backward imposes no additional difficulty on the aiming, so weapons are just as efficient regardless of the direction of travel. Quite literally, the animats are like the mobile turret of a tank, and just as efficient.

The constant aiming errors were sufficient in the previous part to produce realistic aiming, but we need a more plausible error model for higher-level behaviors, such as weapon selection, to be more humanlike. The weapon selection is already very realistic, but believability could be taken a step further by increasing the variability of the accuracy. To achieve this, we'll improve the aiming error model to take into account movement and the relative direction of travel. The more the animat moves, the less accurately it will turn; also, moving forward is more accurate than running backward.
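One simple way to shape such an error model is to scale a baseline error by movement speed and apply an extra penalty when moving backward. Only the shape of the model comes from the text; the coefficients below are arbitrary assumptions.

```python
def aiming_error(base_error, speed, moving_forward,
                 speed_factor=0.05, backward_penalty=1.5):
    """Standard deviation (in degrees) of the turning error.

    base_error       -- error when standing still
    speed            -- current movement speed (game units)
    moving_forward   -- False when the animat is running backward
    speed_factor     -- assumed growth of error per unit of speed
    backward_penalty -- assumed multiplier for backward movement
    """
    error = base_error * (1.0 + speed_factor * speed)
    if not moving_forward:
        error *= backward_penalty   # running backward is less accurate
    return error

print(aiming_error(2.0, 0, True))    # 2.0 : standing still, baseline error
print(aiming_error(2.0, 10, True))   # 3.0 : moving forward at speed 10
print(aiming_error(2.0, 10, False))  # 4.5 : same speed, running backward
```

Sampling the actual perturbation from a zero-mean distribution with this standard deviation would then make accuracy degrade with movement, breaking the "mobile tank turret" symmetry described above.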