walbourn ([personal profile] walbourn) wrote 2004-03-29 05:07 pm

GDC 2004 Trip Report Day 1



DirectX Developer Day Tutorial

"Windows Best Practices"
The day started with a presentation on "best practices" for Windows development, much of it repeated from the last Dev Day I attended. Most of the issues revolve around incorrect assumptions about Windows XP, particularly directory locations, assumed security access, subtle differences in the memory manager and message order/processing, and the implications of running with Fast User Switching, which is only available in non-domain configurations of XP.

It was suggested that applications check for the WHQL bit on the driver and warn the user of potential issues if it is not set. I've heard from others this operation can be very slow, so it might not be suitable for use on start-up every time. Still, MS is concerned about people using drivers that haven't passed through the WHQL program and customers blaming the application or OS for problems that actually occur in the driver.
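
As a rough illustration of what the check yields: as I understand the Direct3D 9 documentation, requesting the adapter identifier with the D3DENUM_WHQL_LEVEL flag fills in a packed WHQLLevel value. The sketch below only decodes such a value and does not call Direct3D; the struct and function names are hypothetical, and the bit layout is my reading of the documentation rather than something stated in the talk.

```cpp
#include <cassert>
#include <cstdint>

// Decoded form of the packed WHQL value. Field names are hypothetical.
struct WhqlStatus {
    bool certified;  // false: driver/device pair is not WHQL certified
    bool hasDate;    // false: certified, but no test date is available
    int year;
    int month;
    int day;
};

// Assumed packing: 0 = not certified, 1 = certified without a date,
// otherwise year in bits 31-16, month in bits 15-8, day in bits 7-0.
WhqlStatus decodeWhqlLevel(std::uint32_t level) {
    if (level == 0) return WhqlStatus{false, false, 0, 0, 0};
    if (level == 1) return WhqlStatus{true, false, 0, 0, 0};
    return WhqlStatus{true, true,
                      static_cast<int>(level >> 16),
                      static_cast<int>((level >> 8) & 0xFFu),
                      static_cast<int>(level & 0xFFu)};
}

// Builds a packed date, e.g. a minimum driver date to compare against.
std::uint32_t packWhqlDate(int year, int month, int day) {
    assert(year >= 1999 && year <= 0xFFFF);
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    return (static_cast<std::uint32_t>(year) << 16) |
           (static_cast<std::uint32_t>(month) << 8) |
           static_cast<std::uint32_t>(day);
}

// Warn when the driver is uncertified. Cache this answer between runs,
// since the underlying query is reportedly slow.
bool shouldWarnAboutDriver(std::uint32_t level) {
    return !decodeWhqlLevel(level).certified;
}
```

Given the reported cost of the query, one option is to store the decoded result alongside the driver version and only re-query when the driver changes.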

Another suggestion was to ship game executables that are signed with a VeriSign ID, named something meaningful (i.e., not GAME.EXE), and contain a valid version resource record, so that the Windows Error Reporting data (i.e., when a user chooses to 'Send' crash results to MS) is more meaningful. This requires that the developer keep an archive of the matching symbols and source for the release build so they can more quickly resolve these crashes based on the limited crash dump information provided by this system. Access to this data is handled through Microsoft developer contacts.


"Windows Performance and PIX"
Profiling graphics applications is difficult on the PC because the D3D runtime buffers commands to minimize the cost of user/kernel mode switches. MS developed a frame-based custom profiling tool for the Xbox, called PIX, and because of its usefulness they have started a version for Windows. This tool, which is in the DirectX 9 Summer 2004 update, supports a plug-in architecture so that vendors can provide more detailed h/w specific performance data and game developers can add their own custom performance counter information. The talk gave guidelines for profiling and pointed out other potential sources of false performance data.

The remainder of the talk discussed potential areas of performance issues with D3D titles. Often the performance issues are not graphics related, but instead related to I/O such as loading, memory paging, or texture thrashing, and occasionally related to API misuse. The current D3D architecture works best with medium-sized batches presented as indexed triangle lists with vertices sorted for h/w cache coherency. Small batches cause very high overhead in the user/kernel switch, and large batches cause CPU performance hits when copying commands to the driver. The Summer 2004 update version of D3DX has a version of the cache-coherency optimizer that works directly on index buffers and vertex buffers instead of requiring D3DX structures. On PC, it is important to do this optimization based on the current video h/w in the machine, either by doing this at load-time or by detecting a video-board change to optimize the data for reuse.
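
To make "sorted for h/w cache coherency" a little more concrete, here is a minimal sketch that scores an indexed triangle list by simulating a small post-transform vertex cache. The FIFO replacement policy and the cache size are assumptions for illustration (real hardware varies, which is why the optimization is per-board), and this is not the D3DX optimizer itself, just a way to measure an index order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Average cache miss ratio: vertex transforms needed per triangle when an
// indexed triangle list is fed through a FIFO post-transform cache.
// Lower is better; 3.0 means no reuse at all.
double averageCacheMissRatio(const std::vector<std::uint32_t>& indices,
                             std::size_t cacheSize) {
    assert(cacheSize > 0 && indices.size() % 3 == 0);
    if (indices.empty()) return 0.0;

    std::deque<std::uint32_t> cache;  // oldest entry at the front
    std::size_t misses = 0;
    for (std::uint32_t idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end()) {
            continue;  // hit: vertex already transformed
        }
        ++misses;
        cache.push_back(idx);
        if (cache.size() > cacheSize) cache.pop_front();
    }
    return static_cast<double>(misses) /
           static_cast<double>(indices.size() / 3);
}
```

Running the same mesh through this with two different index orders gives a quick way to see whether a reordering actually helps for an assumed cache size.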

Other graphics techniques to consider: A z-prepass to prime the Z-buffer for better pixel-rejection, scene presented in front-to-back order (only works for opaque objects), sorted batches per material/effect and render state, and organizing the engine such that specific ordering logic can be easily modified and adjusted later in the profiling stages.
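
A minimal sketch of the last two points, assuming a hypothetical DrawItem record per opaque draw call: the ordering rule is a plain function that can be swapped during profiling, and a small counter shows how many effect/material changes a given order would cause. None of these names come from the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-draw record for the opaque pass.
struct DrawItem {
    std::uint16_t effectId;    // shader / effect
    std::uint16_t materialId;  // textures and render state
    float viewDepth;           // distance from the camera
};

// The ordering rule is just a function, so it is easy to swap later.
using DrawOrder = bool (*)(const DrawItem&, const DrawItem&);

// Group by effect, then material, then front-to-back within a group.
bool byStateThenDepth(const DrawItem& a, const DrawItem& b) {
    if (a.effectId != b.effectId) return a.effectId < b.effectId;
    if (a.materialId != b.materialId) return a.materialId < b.materialId;
    return a.viewDepth < b.viewDepth;
}

// Pure front-to-back, ignoring state changes.
bool byDepthOnly(const DrawItem& a, const DrawItem& b) {
    return a.viewDepth < b.viewDepth;
}

void sortDraws(std::vector<DrawItem>& items, DrawOrder order) {
    assert(order != nullptr);
    std::stable_sort(items.begin(), items.end(), order);
}

// Number of effect/material switches the given order would cause.
std::size_t countStateChanges(const std::vector<DrawItem>& items) {
    std::size_t changes = 0;
    for (std::size_t i = 1; i < items.size(); ++i) {
        if (items[i].effectId != items[i - 1].effectId ||
            items[i].materialId != items[i - 1].materialId) {
            ++changes;
        }
    }
    return changes;
}
```

Which rule wins depends on whether the title is bound by state changes or by pixel fill, which is the reason to keep the rule easy to replace.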

Other non-graphics techniques to consider:
  • I/O: on-demand loading with stand-in graphics resources, use sequential file processing optimizations (and API hint flags) including potential duplication of data on disk, avoid opening/closing files repeatedly or opening many files at once even infrequently
  • DLL: 'delay load' linker setting, disabling thread library calls for the DLL thread-attach messages, statically initializing DLL global variables using 'const' to ensure they are put into the read-only section
  • Compiler: Use highest level warnings including 64-bit compatibility warnings, use CPU-specific compile flags and best optimization options, and consider disabling exception handling if you don't use it
  • VMM: be careful about relying too heavily on the virtual memory paging as this operation is slow
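
The first I/O bullet (on-demand loading with stand-in resources) can be sketched roughly as follows. The TextureCache class, its integer handles, and the one-load-per-call pump are all hypothetical; the actual disk read and upload are replaced by handing out the next handle.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical on-demand cache: callers always get a usable handle, which
// is the stand-in until the real resource has been loaded.
class TextureCache {
public:
    explicit TextureCache(int placeholderHandle)
        : placeholder_(placeholderHandle), nextHandle_(placeholderHandle + 1) {}

    // Returns the real handle if loaded; otherwise queues the load (once)
    // and returns the stand-in.
    int request(const std::string& name) {
        auto it = loaded_.find(name);
        if (it != loaded_.end()) return it->second;
        if (requested_.insert(name).second) pending_.push_back(name);
        return placeholder_;
    }

    // Completes at most one queued load. Call it once per frame, or from a
    // loader thread with suitable locking (not shown).
    bool pumpOne() {
        if (pending_.empty()) return false;
        const std::string name = pending_.front();
        pending_.pop_front();
        assert(loaded_.count(name) == 0);  // a name is queued only once
        loaded_[name] = nextHandle_++;     // stands in for the disk read
        return true;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    int placeholder_;
    int nextHandle_;
    std::unordered_map<std::string, int> loaded_;
    std::unordered_set<std::string> requested_;
    std::deque<std::string> pending_;
};
```

Because requests are queued in order, the loader can also service them in an order that matches the on-disk layout, which fits the sequential-access advice in the same bullet.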



    "Managed Code in Gaming"
    This presentation discussed the issues of using 'managed code' for games development via languages such as Microsoft's C# and the Common Language Runtime (CLR). The focus of the talk was on the benefits of managed code, and it suggested using managed code for primary development, as a scripting language, and for tools. Issues of working with the garbage-collecting memory manager to avoid performance penalties were explored as well. DirectX is fully exposed as managed code, and managed code can access non-managed code through a standardized mechanism.

    While I believe 'managed code' languages do have benefits for rapid development, debugging ease, type safety, and code access security as suggested by this presentation, their usefulness is limited. The CLR is not yet fully deployed, so its use would imply additional installation issues, and it is only usable on Microsoft Windows PCs. From conversations I had later during the conference, 'managed code' is being used widely for developing internal tools and possibly even end-user tools on the PC, but its lack of portability and the performance implications make it unsuited for use in game engines or even as a scripting solution.


    "Microsoft VS/PS 3.0"
    The new DirectX 9.0 Summer Update 2004 will include a new revision of the vertex shader and pixel shader models, version 3.0, and a matching version of the High Level Shader Language. It includes more instructions (dealing with FSAA centers, texture sampling within the vertex shader, and computation of gradients for positions/texture lookups) and more registers, and it is fully inclusive of previous shader models; it is to be implemented by newer hardware coming onto the market. The new 3.0 model requires using 3.0 for both the vertex and pixel shader, whereas the older models allowed mix & match of versions.


    "Preshaders"
    The Summer Update .fx file system includes a new 'preshader' model, which is used by HLSL 3.0. A preshader consists of the shader instructions whose results are constants, so they can be precomputed on the CPU (via a software shader emulator). This is mostly useful for rapid development of shaders and for finding all constant code within a shader. D3DX includes routines for both effect compilation and disassembly.
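
    The idea can be sketched in plain C++ rather than HLSL: an expression that depends only on shader constants is evaluated once per draw on the CPU, and the per-vertex code sees just the result. All names here are hypothetical and the "shader" is an ordinary function; the real extraction is done by the effect compiler, not by hand.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Shader constants set by the application once per draw (hypothetical).
struct Constants {
    float time;
    float amplitude;
};

// Naive shader body: the wave term depends only on constants, yet it is
// re-evaluated for every vertex.
float shadeNaive(float height, const Constants& c) {
    return height + c.amplitude * std::sin(c.time);
}

// "Preshader": the constant-only part, run once per draw on the CPU.
float preshader(const Constants& c) {
    return c.amplitude * std::sin(c.time);
}

// Remaining shader body: sees only the precomputed constant.
float shadeHoisted(float height, float wave) {
    return height + wave;
}

// One draw call: evaluate the preshader once, then run the cheap body
// for every vertex.
std::vector<float> drawHoisted(const std::vector<float>& heights,
                               const Constants& c) {
    const float wave = preshader(c);
    assert(std::isfinite(wave));  // a bad constant would hit every vertex
    std::vector<float> out;
    out.reserve(heights.size());
    for (float h : heights) out.push_back(shadeHoisted(h, wave));
    return out;
}
```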


    "Effects Performance"
    This talk focused on performance issues using the .fx effects system, which was enhanced and somewhat reworked for the Summer 2004 update. The implied rules for when states are updated were replaced by explicit use of a new CommitChanges call, Pass was renamed BeginPass to match the new EndPass call, shared variable support was optimized, and the new preshaders were added.
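
    To show the calling pattern this implies, here is a small mock, not the D3DX interface itself. The method names mirror the Begin/BeginPass/CommitChanges/EndPass/End sequence, but the class, its float-only parameters, and the "device" map are assumptions made up for illustration; in particular, how the real runtime defers state is not something the talk spelled out in this detail.

```cpp
#include <cassert>
#include <map>
#include <string>

// Mock effect: parameter changes made inside a pass stay pending until
// commitChanges() pushes them to the (simulated) device.
class MockEffect {
public:
    void begin() {
        assert(!inEffect_);
        inEffect_ = true;
    }
    void beginPass() {
        assert(inEffect_ && !inPass_);
        inPass_ = true;
        flush();  // values set before the pass are applied when it begins
    }
    void setFloat(const std::string& name, float value) {
        pending_[name] = value;
    }
    void commitChanges() {
        assert(inPass_);
        flush();
    }
    void endPass() {
        assert(inPass_);
        inPass_ = false;
    }
    void end() {
        assert(inEffect_ && !inPass_);
        inEffect_ = false;
    }
    // What a draw call issued right now would see.
    float deviceValue(const std::string& name) const {
        auto it = device_.find(name);
        return it == device_.end() ? 0.0f : it->second;
    }

private:
    void flush() {
        for (const auto& kv : pending_) device_[kv.first] = kv.second;
        pending_.clear();
    }
    std::map<std::string, float> pending_;
    std::map<std::string, float> device_;
    bool inEffect_ = false;
    bool inPass_ = false;
};
```

    The point of the explicit call is that a per-object parameter changed between two draws inside the same pass has to be followed by a commit before the second draw.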


    "Integrated D3DX Effects"
    This talk was presented by a D3D developer who made extensive use of the .fx effects system in their engine, relying heavily on meta-data in the .fx files for integration with their tool UI.


    "Games, Firewalls, and XP SP 2"
    The existing firewall solution in Windows XP is problematic, difficult to use, and off by default. Windows XP SP 2 will include a new version of the Microsoft Firewall that makes some fundamental changes, and will be on by default. For Client/Server model games, this firewall should have no impact on the client. For Peer-to-Peer games and for game servers, the firewall will block the traffic since there are 'unsolicited' connections made. To bypass the firewall, the system can be configured on an executable-by-executable basis to disable the 'unsolicited' traffic block, which can be done by a new configuration API, by the user in a control panel, or in response to a pop-up the system creates for an application trying to receive unsolicited traffic. Even in this case, it is possible for the system to be set to 'no exceptions' mode to block all unsolicited traffic as a global setting, for example when a user is on an unsecured network such as those in an Internet Café.

    Applications should make any required firewall configuration at installation time, check for 'no exceptions' mode and other firewall status to disable in-game multiplayer UI, and properly handle losing focus in case the pop-up appears.
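
    The UI decision described above might look something like this. The FirewallStatus struct is a hypothetical snapshot that the game would fill in from the new configuration API; this sketch does not call that API and the field names are my own.

```cpp
#include <cassert>

// Hypothetical snapshot of the firewall state relevant to hosting a game.
struct FirewallStatus {
    bool enabled;        // firewall is on
    bool noExceptions;   // global 'no exceptions' mode is set
    bool appAuthorized;  // this executable is allowed unsolicited traffic
};

enum class HostingUi { Available, NeedsAuthorization, Disabled };

// Decide what the host/peer-to-peer part of the multiplayer menu offers.
HostingUi hostingUiFor(const FirewallStatus& fw) {
    if (!fw.enabled) return HostingUi::Available;
    if (fw.noExceptions) return HostingUi::Disabled;  // overrides per-app rule
    return fw.appAuthorized ? HostingUi::Available
                            : HostingUi::NeedsAuthorization;
}

// Short label for the menu entry.
const char* hostingUiLabel(HostingUi ui) {
    switch (ui) {
        case HostingUi::Available:          return "Host game";
        case HostingUi::NeedsAuthorization: return "Host game (firewall setup required)";
        case HostingUi::Disabled:           return "Hosting blocked by firewall";
    }
    assert(false && "unhandled HostingUi value");
    return "";
}
```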


    "Pre-computed Radiance Transfer"
    This talk demonstrated recently introduced realistic lighting techniques supported via the D3DX library: spherical harmonics (SH) and pre-computed radiance transfer (PRT). These techniques work very well for static models, although some recent work is trying to extend the usefulness of these techniques to moving objects.
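
    The reason PRT is cheap at run time is that shading reduces to a dot product between a per-vertex transfer vector (computed offline) and the lighting environment projected into SH coefficients. The sketch below uses only the first two SH bands and an unshadowed diffuse transfer vector as a stand-in for the offline simulation; it does not use the D3DX routines, and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Sh4 = std::array<float, 4>;  // SH bands l = 0 and l = 1
const float kPi = 3.14159265f;

// Real SH basis functions for a unit direction (x, y, z).
Sh4 shBasis(float x, float y, float z) {
    assert(std::fabs(x * x + y * y + z * z - 1.0f) < 1e-3f);
    return Sh4{{0.282095f, 0.488603f * y, 0.488603f * z, 0.488603f * x}};
}

// Stand-in for the offline pass: diffuse transfer for a vertex with the
// given unit normal and no shadowing. The clamped-cosine lobe contributes
// pi (band 0) and 2*pi/3 (band 1); dividing by pi for a diffuse surface
// leaves the factors 1 and 2/3.
Sh4 unshadowedTransfer(float nx, float ny, float nz, float albedo) {
    const Sh4 y = shBasis(nx, ny, nz);
    const float band0 = albedo;
    const float band1 = albedo * 2.0f / 3.0f;
    return Sh4{{band0 * y[0], band1 * y[1], band1 * y[2], band1 * y[3]}};
}

// SH projection of lighting that has the same radiance in all directions.
Sh4 uniformLight(float radiance) {
    return Sh4{{radiance * 0.282095f * 4.0f * kPi, 0.0f, 0.0f, 0.0f}};
}

// Run-time shading: one dot product per vertex (per color channel).
float shadePrt(const Sh4& transfer, const Sh4& light) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < transfer.size(); ++i) {
        sum += transfer[i] * light[i];
    }
    return sum;
}
```

    A real transfer vector would also bake in self-shadowing and interreflection, which is why the technique fits static models best: the expensive part is tied to fixed geometry.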


    "DirectX Graphics Future"
    The day wrapped up with a discussion of future directions for the DirectX API. Developers are requesting many new features, but also want better performance. The current architecture suffers from issues with small batch sizes--which are natural for complex material usage--due to the driver model and mapping to the h/w. The Summer Update 2004 does add some additional functionality to help work around the small batch size issue, but resolving this will require a fundamental rework of the driver model. This remains a fundamental performance issue with DirectX and makes it difficult to get full utilization of the GPU.

    The DirectX API is a fundamental part of the Longhorn version of Windows, and work is being done to re-architect the driver model as part of the new version. Longhorn will rely heavily on Direct3D for the UI. This new version of Direct3D will remove the fixed-function pipeline, completely rework the driver interface to minimize both general overhead and h/w mapping overhead, virtualize the video and GPU resources, merge the vertex and pixel shader into a single unit, remove nearly all capabilities bits to give a more unified model for s/w, and make many more improvements to the entire graphics pipeline. This work is likely at least 1 or 2 years away from release, so DirectX 9.0 (possibly with more minor updates) will be in place until then.