[RFC PATCH 00/16] Add a TTM shrinker
Thomas Hellström
thomas.hellstrom at linux.intel.com
Wed Feb 15 16:13:49 UTC 2023
This series introduces a TTM shrinker.
Currently the TTM subsystem allows a certain watermark fraction of
system memory to be pinned by GPUs. Any allocation beyond that will
cause TTM to attempt to copy memory to shmem objects for possible
later swapout, so that the fraction is respected again. That also
happens, unnecessarily, on systems where swapping is not available,
but it still works reasonably well in many cases.
However, there is no way for the system to swap out all of graphics
memory, even in situations where graphics processes are suspended.
So add a TTM shrinker capable of moving graphics memory pages to the
swap cache for later laundering and freeing, and, in the case there is
no swap available, freeing graphics memory that is kept around for
caching purposes.
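As a rough illustration of that policy, here is a minimal userspace
sketch. It is not code from the series: struct toy_page,
toy_shrink_page() and the state names are made up for the example, and
it only models the decision, not the actual swap-cache insertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one graphics memory page as seen by a shrinker. */
enum toy_page_state { TOY_RESIDENT, TOY_IN_SWAP_CACHE, TOY_FREED };

struct toy_page {
	enum toy_page_state state;
	bool purgeable;	/* content is only kept around for caching */
};

/*
 * Try to reclaim one page. Returns true if the page no longer occupies
 * graphics memory afterwards.
 */
static bool toy_shrink_page(struct toy_page *p, bool swap_available)
{
	assert(p != NULL);

	if (p->state != TOY_RESIDENT)
		return false;	/* nothing left to reclaim */

	if (swap_available) {
		/* Content is preserved and written out later. */
		p->state = TOY_IN_SWAP_CACHE;
		return true;
	}

	if (p->purgeable) {
		/* No swap: only content nobody depends on may be dropped. */
		p->state = TOY_FREED;
		return true;
	}

	return false;	/* no swap and content is needed: keep the page */
}
```

In this model a page whose content is still needed is never lost: without
swap it simply stays resident.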
For devices where the shrinker is active, the watermark
fraction is disabled, but for devices not (yet) supporting shrinking
or using dma_alloced memory which we can't insert into the swap-cache,
keep it around.
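The rule above can be sketched as a small predicate. Again this is a
userspace model with hypothetical names (struct toy_pool and its
fields), not the interface added by the series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device pool description. */
struct toy_pool {
	bool shrinker_enabled;	/* driver has enabled the shrinker */
	bool uses_dma_alloc;	/* pages can't be inserted into the swap-cache */
};

/* A pool is shrinkable only if its pages can go to the swap-cache. */
static bool toy_pool_is_shrinkable(const struct toy_pool *pool)
{
	assert(pool != NULL);
	return pool->shrinker_enabled && !pool->uses_dma_alloc;
}

/* Watermark accounting stays in place for everything that isn't shrinkable. */
static bool toy_pool_uses_watermark(const struct toy_pool *pool)
{
	return !toy_pool_is_shrinkable(pool);
}
```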
Each driver needs to implement a callback to enable the shrinker for
its devices. Enable it for i915 as a POC. Will also be used by the
new Intel xe driver if accepted.
The parts of the series mostly needing consideration and feedback are:
*) The mm part, inserting pages into the swap-cache. Is it acceptable and,
if so, correct? It *might* be possible we can do without this part,
but then we'd have to be able to call read_mapping_page() and
trylock_page() on non-isolated shmem pages from reclaim context,
and need to be able to recover from failures.
*) The TTM driver callback for shrinking
*) The additional TTM functions to mark buffer-objects as not needed, but
good to have around for caching purposes.
*) Swapin doesn't lose content on error and is also interruptible or at
least killable ATM. This complicates helpers. Should we
drop this and just drop content on error, and wait for swapin
uninterruptible? The TTM pool code could indeed do without additional
complication...
*) Is there a better way to do shrink throttling to avoid filling the
swap-cache completely?
*) Is it good enough for real-world workloads?
The series has been tested using the i915 driver with a 4GiB
VRAM DG1 on a system with 14GiB of system memory and 16GiB of SSD swap,
using an old igt-gpu-tools version, 8c0bb07b7b4d, of gem_lmem_swapping,
which overcommits system memory quite extensively.
Patch walkthrough:
Initial bugfixes, could be decoupled from the series.
drm/ttm: Fix a NULL pointer dereference.
drm/ttm/pool: Fix ttm_pool_alloc error path.
Cleanups and restructuring:
drm/ttm: Use the BIT macro for the TTM_TT_FLAGs
drm/ttm, drm/vmwgfx: Update the TTM swapout interface
drm/ttm: Unexport ttm_global_swapout()
Adding shrinker without enabling it:
drm/ttm: Don't use watermark accounting on shrinkable pools
drm/ttm: Reduce the number of used allocation orders for TTM pages
drm/ttm: Add a shrinker and shrinker accounting
drm/ttm: Introduce shrink throttling
drm/ttm: Remove pinned bos from shrinkable accounting
drm/ttm: Add a simple api to set / clear purgeable ttm_tt content
Adding the core mm part to insert and read-back pages from the swap-cache:
mm: Add interfaces to back up and recover folio contents using swap.
TTM helpers for shrinking:
drm/ttm: Make the call to ttm_tt_populate() interruptible when faulting.
drm/ttm: Provide helpers for shrinking.
drm/ttm: Use fault-injection to test error paths.
Enable i915:
drm/i915, drm/ttm: Use the TTM shrinker rather than the external shmem pool
Any feedback greatly appreciated.
Thomas
Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: "Matthew Wilcox (Oracle)" <willy at infradead.org>
Cc: Miaohe Lin <linmiaohe at huawei.com>
Cc: David Hildenbrand <david at redhat.com>
Cc: Johannes Weiner <hannes at cmpxchg.org>
Cc: Peter Xu <peterx at redhat.com>
Cc: NeilBrown <neilb at suse.de>
Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
Cc: Christian Koenig <christian.koenig at amd.com>
Cc: Dave Airlie <airlied at redhat.com>
Cc: <linux-graphics-maintainer at vmware.com>
Cc: <linux-mm at kvack.org>
Cc: <intel-gfx at lists.freedesktop.org>
Thomas Hellström (16):
  drm/ttm: Fix a NULL pointer dereference
  drm/ttm/pool: Fix ttm_pool_alloc error path
  drm/ttm: Use the BIT macro for the TTM_TT_FLAGs
  drm/ttm, drm/vmwgfx: Update the TTM swapout interface
  drm/ttm: Unexport ttm_global_swapout()
  drm/ttm: Don't use watermark accounting on shrinkable pools
  drm/ttm: Reduce the number of used allocation orders for TTM pages
  drm/ttm: Add a shrinker and shrinker accounting
  drm/ttm: Introduce shrink throttling.
  drm/ttm: Remove pinned bos from shrinkable accounting
  drm/ttm: Add a simple api to set / clear purgeable ttm_tt content
  mm: Add interfaces to back up and recover folio contents using swap
  drm/ttm: Make the call to ttm_tt_populate() interruptible when
    faulting
  drm/ttm: Provide helpers for shrinking
  drm/ttm: Use fault-injection to test error paths
  drm/i915, drm/ttm: Use the TTM shrinker rather than the external shmem
    pool
 drivers/gpu/drm/Kconfig                      |  11 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |   6 -
 .../gpu/drm/i915/gem/i915_gem_object_types.h |   6 -
 drivers/gpu/drm/i915/gem/i915_gem_pages.c    |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c      | 273 ++-------
 drivers/gpu/drm/i915/i915_gem.c              |   3 +-
 drivers/gpu/drm/ttm/ttm_bo.c                 |  45 +-
 drivers/gpu/drm/ttm/ttm_bo_vm.c              |  19 +-
 drivers/gpu/drm/ttm/ttm_device.c             |  85 ++-
 drivers/gpu/drm/ttm/ttm_pool.c               | 522 ++++++++++++++++--
 drivers/gpu/drm/ttm/ttm_tt.c                 | 336 +++++++++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c          |   3 +-
 include/drm/ttm/ttm_bo.h                     |   4 +-
 include/drm/ttm/ttm_device.h                 |  36 +-
 include/drm/ttm/ttm_pool.h                   |  19 +
 include/drm/ttm/ttm_tt.h                     |  57 +-
 include/linux/swap.h                         |  10 +
 mm/Kconfig                                   |  18 +
 mm/Makefile                                  |   2 +
 mm/swap_backup_folio.c                       | 178 ++++++
 mm/swap_backup_folio_test.c                  | 111 ++++
 21 files changed, 1361 insertions(+), 388 deletions(-)
 create mode 100644 mm/swap_backup_folio.c
 create mode 100644 mm/swap_backup_folio_test.c
--
2.34.1