AI is an important part of making today's games fun and immersive. An NPC's personality comes through in a myriad of ways: how it navigates the world, how it perceives its surroundings, and how it reacts to them. One of the most direct ways a designer can give an AI personality is through how it navigates the world, but the designer shouldn't have to worry about the actual movement implementation; they should be thinking about higher-level behavior. The industry is moving away from scripted AI toward more dynamic solutions anyway.
The traditional approach, still in use today, involves sending go-to commands to an NPC: the NPC receives destination coordinates, finds a path to the destination, and scans the path for impeding obstacles to avoid. This scanning system is also used to find targets in the level and to manage the AI's threat perception, for example telling the AI to spawn in an area safely away from any current threats.
World Representations
There have been various ways to lay out the world for the AI to traverse, including 2D grids, waypoint graphs, and polygon/nav meshes; combinations of these are used as well, with nav meshes gaining considerable popularity. The A* algorithm is typically used for pathfinding on the laid-out navigation world, as it is fast and efficient in practice.
The Insomniac AI uses nav in the following steps:
- Game AI sets up the NPC's nav requirements: essentially the go-to command or destination of the NPC
- The nav system performs pathfinding, most likely using the A* algorithm and smooths out the result
- The steering system looks through the path to find any obstacles and how to avoid them
- The game AI then takes the steering output and turns it into animation input
- The final position and orientation of the NPC is achieved
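Step 2 of the pipeline above is classic graph search. As a minimal sketch (a generic A* over an arbitrary adjacency graph, not Insomniac's actual triangle-mesh data layout; the toy grid standing in for the search graph is hypothetical):

```python
import heapq

def astar(start, goal, neighbors, heuristic):
    """Generic A*: `neighbors(n)` yields (next_node, edge_cost) pairs;
    `heuristic(n)` is an admissible estimate of the cost to the goal."""
    open_set = [(heuristic(start), 0.0, start)]
    came_from = {}
    best_g = {start: 0.0}
    while open_set:
        _, g, node = heapq.heappop(open_set)
        if node == goal:
            # Reconstruct the path by walking parents back to the start.
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry, a better route was already found
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                came_from[nxt] = node
                heapq.heappush(open_set, (ng + heuristic(nxt), ng, nxt))
    return None  # no path exists

# Toy 4-connected 5x5 grid standing in for the navigation graph.
def grid_neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < 5 and 0 <= y + dy < 5:
            yield (x + dx, y + dy), 1.0

path = astar((0, 0), (4, 4), grid_neighbors,
             lambda p: abs(p[0] - 4) + abs(p[1] - 4))  # Manhattan heuristic
```

In the real system the graph nodes and connections come from the nav mesh, and the raw A* output is then smoothed as described above.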
Resistance: Fall of Man used the above nav-mesh representation, where the designer would lay out the entire nav mesh in Maya. The polys acted as nodes and the edges were the connections used for pathfinding in A*. As it turned out, navigation was one of the big bottlenecks at the PS3's launch, and this setup could only support 8 navigating NPCs at one time. To fix this they implemented an AI LOD system so that NPCs at a distance did not use nav, but this was only during that initial experimental PS3 period, when developers were still figuring out how to program for the architecture.
The scale between RFOM and Resistance 2 also changed drastically: three RFOM levels could make up one R2 level, and while level streaming took care of rendering the level, the nav mesh was far too large, with roughly 3x the poly density. So they removed the AI LOD and let bots use the nav system from a distance, primarily because they were developing an 8-player co-op mode, which needed enough NPCs to entertain all the players concurrently.
To optimize for this they stopped loading the excess nav mesh for the entire level and partitioned it into clusters so the SPU could have direct memory access. They also changed from convex, 8-sided polys to a triangular mesh representation, which was better optimized: the triangle edges represented A* graph nodes and the polys were the connections between them. This also opened the door for customizing jump-up and jump-down distances in the search graph.
A Hierarchy
The path-finding system initially used hierarchical path-finding, where A* would calculate a coarse path among the poly clusters, which would then be refined into the NPC's immediate path within its current poly cluster. This ran into a problem, however: it wouldn't always give the shortest path the designer expected, so path caching was added, allowing a path request to be checked against the last successful path. They also made the cost estimates in A* more accurate, even at the price of more work per node: the edge-midpoint-to-edge-midpoint estimate was replaced with a computed point on the edge that minimized cost. The final output paths didn't require as much smoothing as before, and they ended up throwing out the hierarchical path-finding system entirely. A* was so efficient it hardly used 10% of the navigation SPU budget, even with 100 NPCs on screen using only nav.
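The talk doesn't spell out how that minimizing point on the edge is computed; one simple way to sketch the idea (an assumption, not Insomniac's routine) is a ternary search along the edge, since the sum of the two distances is convex along a segment:

```python
import math

def edge_point_min_cost(p, goal, a, b, iters=60):
    """Point on edge segment [a, b] minimizing |p->x| + |x->goal|.
    Found by ternary search; a stand-in for replacing edge mid-points
    in the A* cost estimate as described above. 2D points as tuples."""
    def lerp(t):
        return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)

    def cost(t):
        x = lerp(t)
        return math.dist(p, x) + math.dist(x, goal)

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) < cost(m2):
            hi = m2  # minimum lies in the left two thirds
        else:
            lo = m1  # minimum lies in the right two thirds
    return lerp((lo + hi) / 2.0)
```

For a straight shot through an edge, this lands on the geometric crossing point rather than the edge midpoint, which is exactly why the estimate gets tighter.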
Path-finding was extended with more parameters, like the ability to jump up/down various distances, using one nav mesh for NPCs of varied sizes, preferred corner radius, etc. The gameplay code itself was changed to access nav data through queries: the AI synced the previous frame's results and set up queries for the next frame, with all the code streamlined to batch the queries and offload them to the SPUs.
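The one-frame-deferred query pattern can be sketched like this (class and method names are hypothetical; the "solver" stands in for the offloaded SPU batch job):

```python
class BatchedNavQueries:
    """Submit queries this frame, read results next frame, so the
    whole batch can be shipped to a worker in one go."""

    def __init__(self, solver):
        self.solver = solver   # function: query -> result (the batch job)
        self.pending = []      # queries submitted during this frame
        self.results = {}      # results produced by last frame's batch

    def submit(self, qid, query):
        self.pending.append((qid, query))

    def result(self, qid):
        # None on the very first frame, before any batch has run.
        return self.results.get(qid)

    def flush(self):
        # Stand-in for kicking and syncing the offloaded batch job.
        self.results = {qid: self.solver(q) for qid, q in self.pending}
        self.pending = []

nav = BatchedNavQueries(lambda q: q * 2)  # toy solver
nav.submit("npc_1", 21)
nav.flush()  # would happen between frames
```

The gameplay code never blocks on a query; it always consumes last frame's answer, which is what makes the batching and SPU offload possible.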
The SPU
The SPU's actual job involved several things, including finding a point on the mesh, finding a path, and looking for obstacles.
Given a volume and parameters, the mesh query algorithm outputs the closest point on the mesh while maintaining the NPC's tolerance radius from the boundary. The find-path query simply used A* and path smoothing, while obstacle processing gathered the obstacles along each path and could be hand-optimized for common routines.
The engine's game loop is divided into pre-update, update, and post-update. In pre-update the engine reads in the nav query and steering results from the previous frame and sets up the nav queries for the next frame. In the update the nav SPU job gets added along with all the other jobs (physics, animation); this job can theoretically run anywhere from after the pre-update to the next frame's pre-update, bypassing the post-update. The last stage only handles syncing of the animation and physics jobs before handing off to the renderer.
For path smoothing, one big mesh is used so that string-pulling doesn't pull the NPC's tolerance away at corners. This is especially true for large NPCs, like the Titan, which would constantly try to cut across the edges.
Steering
The steering in R2 was very lightweight and required that the NPCs be running at all times, which caused issues while cornering because the animations would not slow down enough. The NPC needs to get to the first bend point in the path while also orienting toward the next bend point. This was done in the nav SPU job, which computed a Bezier approach curve for the bend point.
The start curve tangent was computed using a swept-sphere check in the NPC's current facing direction, and the end curve tangent by a back-projected swept-sphere check from the first bend point in the direction opposite the next bend point. These two tangents are then used to compute the Bezier curve, with the NPC aimed at its mid-point for a smoother approach.
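The control-point scheme can be sketched as a cubic Bezier (the swept-sphere checks that size the tangents in the talk are simplified to a fixed tangent length here; function and parameter names are my own):

```python
def bezier_approach(pos, facing, bend, approach_dir, tangent_len=1.0):
    """Cubic Bezier from the NPC position to the first bend point.
    One control point is pushed along the current facing direction;
    the other is back-projected from the bend point, opposite the
    direction toward the next bend. All arguments are 2D tuples;
    `facing` and `approach_dir` are assumed unit-length."""
    p0 = pos
    p1 = (pos[0] + facing[0] * tangent_len,
          pos[1] + facing[1] * tangent_len)
    p2 = (bend[0] - approach_dir[0] * tangent_len,
          bend[1] - approach_dir[1] * tangent_len)
    p3 = bend

    def point(t):
        # Standard cubic Bernstein evaluation.
        u = 1.0 - t
        return tuple(u**3 * p0[i] + 3 * u * u * t * p1[i]
                     + 3 * u * t * t * p2[i] + t**3 * p3[i]
                     for i in range(2))
    return point

curve = bezier_approach((0.0, 0.0), (1.0, 0.0), (4.0, 4.0), (0.0, 1.0))
```

Aiming the NPC at the curve's mid-point, as the text describes, then gives the smoothed approach heading.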
For higher volumes of NPCs, like 8-player co-op, a simple dynamic avoidance system was used. The SPU would compute escape tangents for each obstacle it came across; specifically, it would look at all escape tangents spanning a circle around the NPC and pick the direction based on the vacant hole closest to the direction of the next path bend point. Sometimes, though, it could pick the closest escape tangent pointing into a wall; in that case, the SPU job would flag the escape tangent, and the steering search would be expanded by 90 degrees.
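The selection step might look like this (a simplified sketch with hypothetical names; the flagging and 90-degree expansion are reduced to a pre-marked `blocked` set):

```python
import math

def pick_escape_direction(candidates, blocked, to_bend_angle):
    """Pick the unblocked escape-tangent angle closest to the
    direction of the next bend point. `candidates` are angles in
    radians spanning the circle around the NPC; `blocked` holds
    tangents already flagged as pointing into a wall."""
    def ang_diff(a, b):
        # Smallest absolute difference between two angles.
        d = (a - b) % (2.0 * math.pi)
        return min(d, 2.0 * math.pi - d)

    best = None
    for a in candidates:
        if a in blocked:
            continue
        if best is None or ang_diff(a, to_bend_angle) < ang_diff(best, to_bend_angle):
            best = a
    return best
```

Picking "the vacant hole closest to the bend direction" keeps avoidance from fighting the path follower.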
Some levels contained about 100 Grims using nav, running down narrow corridors, which again caused cornering issues since the animations would not slow down enough. Nearly 80% of the nav time went to processing obstacles and flagging boundaries. It was common enough to have one obstacle in every three NPCs' paths that they introduced a special lookup cache storing all the obstacle tangents that were 2x tolerance away from the boundary. If an obstacle was in the cache when looked up, the tangent flagging was bypassed. These caches were rolled into the other queries, and the SPU nav job now included this in its calculations.
Target Scanning
A large portion of the processing went into maintaining the NPCs' tolerance distance from the boundary. They tested rendering the nav mesh into a buffer and eroding it to get a pixel threshold away from the boundary, but the designers wanted to shrink that threshold dynamically in game, so they implemented offset-polygon processing code.
Given any NPC, this code would shrink the nav-mesh polys by the NPC's threshold distance, making the resulting nav mesh smaller and thus less taxing.
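For a convex poly, such an inward offset can be sketched by pushing each edge along its inward normal and intersecting neighbouring offset edges (my own 2D sketch; degenerate results for large offsets are not handled):

```python
import math

def shrink_convex_polygon(poly, d):
    """Offset a convex, counter-clockwise 2D polygon inward by d.
    Each edge is translated along its inward normal; new vertices
    are the intersections of consecutive offset edge lines."""
    n = len(poly)
    lines = []  # (point on offset line, edge direction) per edge
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        length = math.hypot(ex, ey)
        nx, ny = -ey / length, ex / length  # inward normal for CCW winding
        lines.append(((x1 + nx * d, y1 + ny * d), (ex, ey)))
    out = []
    for i in range(n):
        (p1, d1), (p2, d2) = lines[i - 1], lines[i]
        # Solve p1 + t*d1 == p2 + s*d2 for the intersection point.
        denom = d1[0] * d2[1] - d1[1] * d2[0]
        t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / denom
        out.append((p1[0] + d1[0] * t, p1[1] + d1[1] * t))
    return out
```

Running this per NPC size is what lets the threshold shrink dynamically instead of being baked into an eroded buffer.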
Custom Links
Custom links are non-mesh links used for activities other than straight pathfinding, like jumps, ladders, tunnels, etc. They were traditionally implemented as point-to-point links, but the team decided on a more logical, design-oriented approach: clue boxes, i.e. boxes with a special front face that specify a link rather than just a point. As long as the front face of the box intersected the nav mesh, the link was considered valid, which was more designer-friendly. These were also added to A* as nodes in the triangular mesh.
Two custom clues, whether clue boxes or nodes placed in A*, could be connected by the designer to make a custom link (like a jump or a ladder climb). An NPC using a custom link can modify the link's cost in A*, and subsequent path queries see that cost until the NPC frees the link.
The path-finding query has a 64-bit field indicating the custom abilities of the AI issuing the query, and every link type is associated with a bit. An animation hint is stored in the link's data, which the AI uses to blend to contextual animations for traversing that link (like climbing up a ledge).
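The ability field is a plain bitmask. A minimal sketch (the specific link-type names are hypothetical; the talk only states that each link type maps to one bit of a 64-bit field):

```python
# Hypothetical link-type bits; one bit per custom link type.
LINK_JUMP_UP   = 1 << 0
LINK_JUMP_DOWN = 1 << 1
LINK_LADDER    = 1 << 2

def can_traverse(npc_abilities, link_bit):
    """A link is usable by a path query only if the query's 64-bit
    ability field has that link type's bit set."""
    return (npc_abilities & link_bit) != 0

# Example NPC that can jump but not climb ladders.
grunt = LINK_JUMP_UP | LINK_JUMP_DOWN
```

During the A* expansion, links whose bit is not set are simply skipped, so one graph serves every NPC archetype.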
Navigation Mesh Generation
In R1 and R2 the designers and QA would generate the nav mesh in Maya; after R2 they moved to a custom nav-mesh poly layout editor, and after that to an auto-generation scheme. Auto-generated nav meshes cut out a lot of designer time, aid iteration, and allow multiple nav meshes to be generated for different-sized NPCs (which can replace the aforementioned offset-polygon processing). Insomniac adopted Recast navigation, developed by Mikko Mononen, into their pipeline.
The steps for nav mesh generation are as follows:
- voxelize the scene geometry
- build navigable voxels, which involves filtering and smoothing the rendered voxels, then eroding the resulting navigable voxels
- build a distance transform (the closest distance from the boundary for each voxel), where each catchment basin becomes a mesh chunk
- trace each catchment basin's contours with filtering, which forms the nav mesh's boundary edges
- triangulate/tessellate to generate the final nav mesh
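The erosion in step 2 is what keeps the generated mesh an agent-radius away from walls. A 2D stand-in for that voxel step (a sketch on an occupancy grid, not Recast's actual implementation):

```python
def erode(grid, radius=1):
    """Erode a 2D occupancy grid of navigable cells (True = walkable):
    a cell survives only if every cell within `radius` (Chebyshev
    distance) is also walkable. Cells off the grid count as blocked,
    so the border is always eaten away."""
    h, w = len(grid), len(grid[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not grid[y][x]:
                continue
            ok = True
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w) or not grid[ny][nx]:
                        ok = False
            out[y][x] = ok
    return out

open_area = [[True] * 5 for _ in range(5)]
eroded = erode(open_area)  # only the inner 3x3 survives
```

Running this with different radii is also how one mesh per NPC size can be produced, as mentioned above.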
Building from voxels can lead to issues, however, like introducing discontinuities, which form an undesired poly soup.
Because designers like to do some customization and post-generation tweaking is inflexible, custom override polys are used. These are added by the designers themselves and fed into the nav-mesh generation process. They can also be used for other things, like removing the nav mesh in unwanted areas or closing holes in the nav mesh caused by holes in the scene.
The generation algorithm can also be tweaked to add or remove sleeping/dynamic elements from the nav mesh.
As seen above, the system can support dynamically generated nav meshes, that is, nav meshes added for new geometry introduced at run time: the voxel data is cleared out and the nav mesh is regenerated from the render geometry. This approach can provide cost savings, like reduced CPU use and fewer pre-loaded nav meshes.
All images and information taken from Reddy Sambavaram's 2011 GDC talk.