ttm

TTM for developers

Background and Naming

TTM is short for Translation Table Maps, which was originally a new DRM map type that could be flipped into and out of a general translation table, for example an AGP or PCI GART. Since then it has, somewhat misleadingly, become the name of the TTM unified memory manager for display memory. A white paper describing the TTM memory manager can be found here. Many considered the original implementation too complex and too inflexible. An update and reorganization has since been done to correct most, if not all, of the issues people were having.

Purpose

The purpose of the TTM implementation is to provide tools for buffer object placement, caching, mapping and synchronization. Placement could, for example, be system memory, pre-bound AGP memory or dynamic AGP memory. The implementation also handles cache coherency automatically by mapping memory in uncached or write-combined placements with the correct attributes. Buffer object data can be mapped to user-space with a single virtual address that doesn't change even if the placement of the buffer object changes. TTM can also optionally handle synchronization on a per-buffer-object level, and it provides an optional read-write lock as a replacement for the global DRM lock. It's designed to minimize lock contention in the presence of multiple rendering clients and command FIFOs.

TTM does not attempt to handle all kinds of GPU mapping. For simple hardware, the placement functionality is enough to give a unique GPU virtual address for a buffer in VRAM or AGP space. For a device with multiple memory contexts and GPU paging, TTM will handle placement and CPU mapping, whereas the driver itself will need to set up the GPU maps.

Relation to GEM

In the following text it's important to distinguish between the GEM user-space API, the GEM core implementation and the Intel device-specific implementation backing this API. TTM can be used as a backing implementation for the GEM user-space API, which, together with the GEM core implementation, is simple and straightforward. However, TTM may also be used with a separate user-space interface for buffer object handling. The choice is up to the driver writer, who can easily set up a unique user-space API.

The advantages of the TTM implementation compared to the implementation backing GEM for the Intel drivers are: The TTM implementation

Drawbacks of the TTM implementation compared to the Intel GEM implementations are mainly:

Components

The TTM code is now structured as a library of functionality that a driver writer can use. The following components currently exist:

Typically the TTM lock is taken in read mode before buffers are validated. For any of the above three conditions, it is instead taken in write (exclusive) mode. The TTM lock implementation lives in ttm_lock.h and ttm_lock.c. Like the other components, the TTM lock is a free-standing implementation and neither buffer objects nor fence objects depend on it. However, as noted above, it derives from the TTM user base object.
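The difference between the two lock modes can be illustrated with a small model. This is only a sketch: the names (`demo_rwlock`, `demo_read_trylock` and so on) are hypothetical and are not the interface declared in ttm_lock.h, and the model is single-threaded, so it tracks which mode combinations are allowed rather than providing real mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a lock with a shared (read) and an exclusive
 * (write) mode. Not thread-safe: it only tracks state to show which
 * combinations of holders are permitted. */
struct demo_rwlock {
    int readers;  /* number of current read-mode holders */
    bool writer;  /* true while the lock is held in write mode */
};

/* Read mode: any number of holders, as long as no writer is active. */
static bool demo_read_trylock(struct demo_rwlock *l)
{
    if (l->writer)
        return false;
    l->readers++;
    return true;
}

static void demo_read_unlock(struct demo_rwlock *l)
{
    assert(l->readers > 0);
    l->readers--;
}

/* Write mode: succeeds only when nobody else holds the lock. */
static bool demo_write_trylock(struct demo_rwlock *l)
{
    if (l->writer || l->readers > 0)
        return false;
    l->writer = true;
    return true;
}

static void demo_write_unlock(struct demo_rwlock *l)
{
    assert(l->writer);
    l->writer = false;
}
```

In this model, several clients validating buffers can hold the lock in read mode at the same time, while an operation that needs exclusive access has to wait until all of them have released it.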

Execbuf utilities

The execbuf IOCTL is usually quite driver-specific. However, there are some functions to aid execbuf driver writing, mainly to reserve buffers for validation in a deadlock-safe and starvation-safe manner, and to put sync objects on those buffers or back off validation in case of errors or interruption. These utilities are located in ttm_execbuf_util.h and ttm_execbuf_util.c. The driver writer may choose whether to use these utilities.
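The reserve-or-back-off idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in ttm_execbuf_util.c: `demo_bo`, `demo_reserve_all` and `demo_backoff` are hypothetical names, and the sketch simply fails with `-EBUSY` on contention, whereas the real utilities use a more involved scheme to guarantee freedom from deadlock and starvation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object: only the reservation state is modelled. */
struct demo_bo {
    bool reserved;
};

/* Release the first n buffers in the list, all of which must currently
 * be reserved by the caller. */
static void demo_backoff(struct demo_bo **list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(list[i]->reserved);
        list[i]->reserved = false;
    }
}

/* Reserve every buffer in the list, or none of them. If one buffer is
 * already reserved (by someone else, or because it is listed twice),
 * release everything taken so far and report -EBUSY, so the caller
 * never sits on a partial set of reservations. */
static int demo_reserve_all(struct demo_bo **list, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i]->reserved) {
            demo_backoff(list, i);
            return -EBUSY;
        }
        list[i]->reserved = true;
    }
    return 0;
}
```

A driver following this pattern would call the reserve step before validation, attach its sync objects once command submission succeeds, and call the back-off step on any error or interruption in between.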

AGP TTM backend

For drivers using traditional AGP memory there's an AGP TTM backend implemented in ttm_agp_backend.c. This backend supports multiple AGP bridges, but the driver needs to know which AGP bridge it sits on. The backend calls into the Linux agpgart kernel API. Drivers with other types of GPU MMU will have to write their own backends for that particular MMU. However, note that hardware with advanced and perhaps per-context MMUs that create GPU views into the data only will probably not want to use the TTM backend mechanism, but a customized bind / unbind procedure. For those drivers, TTM should be used to place data in a suitable memory type and with suitable caching attributes prior to setting up the GPU maps.

If we take the Intel i965 MMU system as an example, TTM with the AGP backend would be used to place buffers either in VRAM (stolen memory) or write-combined in the global GTT aperture for mapping and perhaps for shared buffers. The i965 also has per-memory-context GTTs, the mapping of which is visible to the GPU only. A driver could allocate a private GTT page table per client and let the client manage its private GTT in user-space. The actual binding to a private GTT would then be done by the driver as part of command submission, and the driver would call into TTM to make sure that the buffer objects are in a bindable state: residing in system memory and write-combined.
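The "bindable state" check that such a driver performs before its own bind step could look roughly like the sketch below. All types and names here (`demo_bo`, `demo_bo_bindable`, `demo_private_gtt_bind`) are hypothetical and stand in for driver code. They are not part of TTM or of the AGP backend, and a real driver would ask TTM to move the buffer into the required state rather than only rejecting it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical placement and caching descriptions for a buffer. */
enum demo_mem { DEMO_MEM_SYSTEM, DEMO_MEM_VRAM };
enum demo_caching { DEMO_CACHED, DEMO_UNCACHED, DEMO_WC };

struct demo_bo {
    enum demo_mem mem;          /* where the data currently resides */
    enum demo_caching caching;  /* CPU caching attribute of the pages */
    bool bound;                 /* bound into a private GTT (driver state) */
};

/* Bindable in the sense of the text: system memory, write-combined. */
static bool demo_bo_bindable(const struct demo_bo *bo)
{
    return bo->mem == DEMO_MEM_SYSTEM && bo->caching == DEMO_WC;
}

/* Driver-side bind, done as part of command submission. The placement
 * precondition is checked first; the GPU mapping itself is driver code. */
static int demo_private_gtt_bind(struct demo_bo *bo)
{
    if (!demo_bo_bindable(bo))
        return -EINVAL;
    bo->bound = true;
    return 0;
}

static void demo_private_gtt_unbind(struct demo_bo *bo)
{
    assert(bo->bound);
    bo->bound = false;
}
```

The split mirrors the division of labour described above: placement and caching are the memory manager's concern, while the per-client GPU map is set up and torn down by the driver.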

User-space components

The TTM user-space API is quite small, and a client may use the IOCTLs directly. However, there is a general user-space buffer manager implementation, libwsbm, with backends that interface to the placement user object and fence user object implementations.