GPU Power/Thermal Controls and Monitoring — The Linux Kernel documentation (2023)

HWMON Interfaces

The amdgpu driver exposes the following sensor interfaces:

  • GPU temperature (via the on-die sensor)

  • GPU voltage

  • Northbridge voltage (APUs only)

  • GPU power

  • GPU fan

  • GPU gfx/compute engine clock

  • GPU memory clock (dGPU only)

hwmon interfaces for GPU temperature:

  • temp[1-3]_input: the on-die GPU temperature in millidegrees Celsius. temp2_input and temp3_input are supported on SOC15 dGPUs only.

  • temp[1-3]_label: temperature channel label. temp2_label and temp3_label are supported on SOC15 dGPUs only.

  • temp[1-3]_crit: temperature critical max value in millidegrees Celsius. temp2_crit and temp3_crit are supported on SOC15 dGPUs only.

  • temp[1-3]_crit_hyst: temperature hysteresis for critical limit in millidegrees Celsius. temp2_crit_hyst and temp3_crit_hyst are supported on SOC15 dGPUs only.

  • temp[1-3]_emergency: temperature emergency max value (ASIC shutdown) in millidegrees Celsius. These are supported on SOC15 dGPUs only.

hwmon interfaces for GPU voltage:

  • in0_input: the voltage on the GPU in millivolts

  • in1_input: the voltage on the Northbridge in millivolts

hwmon interfaces for GPU power:

  • power1_average: average power used by the SoC in microWatts. On APUs this includes the CPU.

  • power1_cap_min: minimum cap supported in microWatts

  • power1_cap_max: maximum cap supported in microWatts

  • power1_cap: selected power cap in microWatts
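Since all of the power files use microwatts, a requested cap given in watts has to be converted and should stay inside the range advertised by power1_cap_min and power1_cap_max. The following is a minimal sketch of that arithmetic. The function name and the choice to clamp (rather than reject) an out-of-range request are assumptions made for this example, not driver behavior.

```python
def clamp_power_cap_uw(requested_watts, cap_min_uw, cap_max_uw):
    """Convert a cap in watts to microwatts and clamp it to the allowed range.

    cap_min_uw and cap_max_uw are the integers read from power1_cap_min
    and power1_cap_max (both in microwatts).
    """
    if cap_min_uw > cap_max_uw:
        raise ValueError("cap_min_uw must not exceed cap_max_uw")
    requested_uw = int(round(requested_watts * 1_000_000))
    return max(cap_min_uw, min(requested_uw, cap_max_uw))
```

The returned integer is the value one would write to power1_cap.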

hwmon interfaces for GPU fan:

  • pwm1: pulse width modulation fan level (0-255)

  • pwm1_enable: pulse width modulation fan control method (0: no fan speed control, 1: manual fan speed control using pwm interface, 2: automatic fan speed control)

  • pwm1_min: pulse width modulation fan control minimum level (0)

  • pwm1_max: pulse width modulation fan control maximum level (255)

  • fan1_min: minimum fan speed in revolutions per minute (RPM)

  • fan1_max: maximum fan speed in revolutions per minute (RPM)

  • fan1_input: fan speed in RPM

  • fan[1-*]_target: desired fan speed in revolutions per minute (RPM)

  • fan[1-*]_enable: enable or disable the sensors (1: enable, 0: disable)

NOTE: DO NOT set the fan speed via the "pwm1" and "fan[1-*]_target" interfaces at the same time. Doing so causes the former to be overridden.
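Because pwm1 takes a level between 0 and 255 rather than a percentage, a small conversion helper can be convenient. This is only a sketch: the helper names are hypothetical, and it assumes integer inputs and the pwm1_min/pwm1_max values listed above.

```python
PWM_MAX = 255  # pwm1_max as listed above; pwm1_min is 0

def percent_to_pwm(percent):
    """Map an integer duty cycle in percent (0-100) to a pwm1 level (0-255)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic with round-half-up.
    return (percent * PWM_MAX + 50) // 100

def pwm_to_percent(pwm):
    """Map a pwm1 level (0-255) back to a percentage."""
    if not 0 <= pwm <= PWM_MAX:
        raise ValueError("pwm must be between 0 and 255")
    return pwm * 100 / PWM_MAX
```

For instance, percent_to_pwm(50) yields 128, which would be written to pwm1 with pwm1_enable set to 1 (manual control).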

hwmon interfaces for GPU clocks:

  • freq1_input: the gfx/compute clock in hertz

  • freq2_input: the memory clock in hertz

You can use hwmon tools like sensors to view this information on your system.
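If you read the files directly instead, the raw integers need scaling according to the units listed above. The sketch below shows one way to do that in Python. The function name, the unit strings, and the choice to report clocks in MHz are assumptions of this example.

```python
import re

# Divisor and display unit per channel type, based on the units listed above.
_SCALES = {
    "temp": (1000, "degC"),       # millidegrees Celsius
    "in": (1000, "V"),            # millivolts
    "power": (1_000_000, "W"),    # microwatts
    "freq": (1_000_000, "MHz"),   # hertz
    "fan": (1, "RPM"),            # already in RPM
}

# Only numeric files are matched; *_label and *_enable are deliberately excluded.
_PATTERN = re.compile(
    r"(temp|in|power|freq|fan)\d+_"
    r"(input|average|cap|cap_min|cap_max|crit|crit_hyst|emergency|min|max|target)"
)

def scale_hwmon_value(name, raw):
    """Return (value, unit) for the raw text read from the hwmon file `name`."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unsupported hwmon file: {name}")
    divisor, unit = _SCALES[match.group(1)]
    return int(raw) / divisor, unit
```

For example, a temp1_input reading of "45000" scales to 45.0 degC.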

GPU sysfs Power State Interfaces

GPU power controls are exposed via sysfs files.

power_dpm_state

The power_dpm_state file is a legacy interface and is only provided for backwards compatibility. The amdgpu driver provides a sysfs API for adjusting certain power related parameters. The file power_dpm_state is used for this. It accepts the following arguments:

  • battery

  • balanced

  • performance

battery

On older GPUs, the vbios provided a special power state for battery operation. Selecting battery switched to this state. This is no longer provided on newer GPUs so the option does nothing in that case.

balanced

On older GPUs, the vbios provided a special power state for balanced operation. Selecting balanced switched to this state. This is no longer provided on newer GPUs so the option does nothing in that case.

performance

On older GPUs, the vbios provided a special power state for performance operation. Selecting performance switched to this state. This is no longer provided on newer GPUs so the option does nothing in that case.

power_dpm_force_performance_level

The amdgpu driver provides a sysfs API for adjusting certain power related parameters. The file power_dpm_force_performance_level is used for this. It accepts the following arguments:

  • auto

  • low

  • high

  • manual

  • profile_standard

  • profile_min_sclk

  • profile_min_mclk

  • profile_peak

auto

When auto is selected, the driver will attempt to dynamically select the optimal power profile for current conditions in the driver.

low

When low is selected, the clocks are forced to the lowest power state.

high

When high is selected, the clocks are forced to the highest power state.

manual

When manual is selected, the user can manually adjust which power states are enabled for each clock domain via the sysfs pp_dpm_mclk, pp_dpm_sclk, and pp_dpm_pcie files and adjust the power state transition heuristics via the pp_power_profile_mode sysfs file.

profile_standard, profile_min_sclk, profile_min_mclk, profile_peak

When the profiling modes are selected, clock and power gating are disabled and the clocks are set for different profiling cases. This mode is recommended for profiling specific workloads where you do not want clock or power gating for clock fluctuation to interfere with your results. profile_standard sets the clocks to a fixed clock level which varies from ASIC to ASIC. profile_min_sclk forces the sclk to the lowest level. profile_min_mclk forces the mclk to the lowest level. profile_peak sets all clocks (mclk, sclk, pcie) to the highest levels.
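A script that switches levels may want to reject unknown names before touching sysfs. The following sketch does that. The helper names are hypothetical, and the directory holding the file is passed in by the caller because its location is not given here; writing to it normally requires root privileges.

```python
import os

# The arguments accepted by power_dpm_force_performance_level, as listed above.
PERFORMANCE_LEVELS = (
    "auto", "low", "high", "manual",
    "profile_standard", "profile_min_sclk",
    "profile_min_mclk", "profile_peak",
)

def is_valid_performance_level(level):
    """Return True if `level` is one of the documented arguments."""
    return level in PERFORMANCE_LEVELS

def set_performance_level(device_dir, level):
    """Write `level` to power_dpm_force_performance_level inside `device_dir`.

    `device_dir` is the GPU's sysfs device directory (caller-supplied).
    """
    if not is_valid_performance_level(level):
        raise ValueError(f"unknown performance level: {level!r}")
    path = os.path.join(device_dir, "power_dpm_force_performance_level")
    with open(path, "w") as f:
        f.write(level)
    return path
```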

pp_table

The amdgpu driver provides a sysfs API for uploading new powerplay tables. The file pp_table is used for this. Reading the file will dump the current powerplay table. Writing to the file will attempt to upload a new powerplay table and re-initialize powerplay using that new table.

pp_od_clk_voltage

The amdgpu driver provides a sysfs API for adjusting the clocks and voltages in each power level within a power state. The file pp_od_clk_voltage is used for this.

Note that the actual memory controller clock rate is exposed, not the effective memory clock of the DRAMs. To translate it, use the following formula:

Clock conversion (MHz):

HBM: effective_memory_clock = memory_controller_clock * 1

G5: effective_memory_clock = memory_controller_clock * 1

G6: effective_memory_clock = memory_controller_clock * 2

DRAM data rate (MT/s):

HBM: effective_memory_clock * 2 = data_rate

G5: effective_memory_clock * 4 = data_rate

G6: effective_memory_clock * 8 = data_rate

Bandwidth (MB/s):

data_rate * vram_bit_width / 8 = memory_bandwidth

Some examples:

G5 on RX460:

memory_controller_clock = 1750 MHz

effective_memory_clock = 1750 MHz * 1 = 1750 MHz

data rate = 1750 * 4 = 7000 MT/s

memory_bandwidth = 7000 * 128 bits / 8 = 112000 MB/s

G6 on RX5700:

memory_controller_clock = 875 MHz

effective_memory_clock = 875 MHz * 2 = 1750 MHz

data rate = 1750 * 8 = 14000 MT/s

memory_bandwidth = 14000 * 256 bits / 8 = 448000 MB/s
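The same conversion can be written as a short Python function. This is a sketch of the formulas above only; the function name and the memory type keys ("HBM", "G5", "G6") are chosen for this example.

```python
# (clock multiplier, data-rate multiplier) per memory type, from the formulas above.
_MEMORY_FACTORS = {
    "HBM": (1, 2),
    "G5": (1, 4),
    "G6": (2, 8),
}

def memory_bandwidth_mb_s(memory_type, controller_clock_mhz, vram_bit_width):
    """Return the memory bandwidth in MB/s for the given controller clock."""
    clock_mult, rate_mult = _MEMORY_FACTORS[memory_type]
    effective_clock_mhz = controller_clock_mhz * clock_mult   # MHz
    data_rate_mt_s = effective_clock_mhz * rate_mult          # MT/s
    return data_rate_mt_s * vram_bit_width / 8                # MB/s
```

Calling it with ("G5", 1750, 128) and ("G6", 875, 256) reproduces the 112000 MB/s and 448000 MB/s figures from the two examples.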

< For Vega10 and previous ASICs >

Reading the file will display:

  • a list of engine clock levels and voltages labeled OD_SCLK

  • a list of memory clock levels and voltages labeled OD_MCLK

  • a list of valid ranges for sclk, mclk, and voltage labeled OD_RANGE

To manually adjust these settings, first select manual using power_dpm_force_performance_level. Enter a new value for each level by writing a string that contains "s/m level clock voltage" to the file. E.g., "s 1 500 820" will update sclk level 1 to be 500 MHz at 820 mV; "m 0 350 810" will update mclk level 0 to be 350 MHz at 810 mV. When you have edited all of the states as needed, write "c" (commit) to the file to commit your changes. If you want to reset to the default power levels, write "r" (reset) to the file to reset them.
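As an illustration of the command strings for Vega10 and earlier, the sketch below builds each edit and appends the commit command. The function names are hypothetical, and it assumes each string is written to pp_od_clk_voltage as a separate write.

```python
def od_level_command(domain, level, clock_mhz, voltage_mv):
    """Build an "s/m level clock voltage" string, e.g. "s 1 500 820"."""
    if domain not in ("s", "m"):
        raise ValueError("domain must be 's' (sclk) or 'm' (mclk)")
    return f"{domain} {level} {clock_mhz} {voltage_mv}"

def od_commands(edits):
    """Return the strings to write, in order: every edit, then "c" to commit."""
    return [od_level_command(*edit) for edit in edits] + ["c"]
```

For example, od_commands([("s", 1, 500, 820), ("m", 0, 350, 810)]) gives the two edits from the text followed by "c".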

< For Vega20 and newer ASICs >

Reading the file will display:

  • minimum and maximum engine clock labeled OD_SCLK

  • minimum (not available for Vega20 and Navi1x) and maximum memory clock labeled OD_MCLK

  • three <frequency, voltage> points labeled OD_VDDC_CURVE. They can be used to calibrate the sclk voltage curve. This is available for Vega20 and NV1X.

  • voltage offset for the six anchor points of the v/f curve labeled OD_VDDC_CURVE. They can be used to calibrate the v/f curve. This is only available for some SMU13 ASICs.

  • voltage offset (in mV) applied on target voltage calculation. This is available for Sienna Cichlid, Navy Flounder and Dimgrey Cavefish. For these ASICs, the target voltage calculation can be illustrated by "voltage = voltage calculated from v/f curve + overdrive vddgfx offset"

  • a list of valid ranges for sclk, mclk, and voltage curve points labeled OD_RANGE

< For APUs >

Reading the file will display:

  • minimum and maximum engine clock labeled OD_SCLK

  • a list of valid ranges for sclk labeled OD_RANGE

< For VanGogh >

Reading the file will display:

  • minimum and maximum engine clock labeled OD_SCLK

  • minimum and maximum core clocks labeled OD_CCLK

  • a list of valid ranges for sclk and cclk labeled OD_RANGE

To manually adjust these settings:

  • First select manual using power_dpm_force_performance_level

  • For clock frequency settings, enter a new value by writing a string that contains "s/m index clock" to the file. The index should be 0 to set the minimum clock and 1 to set the maximum clock. E.g., "s 0 500" will update the minimum sclk to 500 MHz; "m 1 800" will update the maximum mclk to 800 MHz. For core clocks on VanGogh, the string contains "p core index clock". E.g., "p 2 0 800" would set the minimum core clock on core 2 to 800 MHz.

  • For the sclk voltage curve:

    • For NV1X, enter the new values by writing a string that contains "vc point clock voltage" to the file. The points are indexed by 0, 1 and 2. E.g., "vc 0 300 600" will update point1 with the clock set to 300 MHz and the voltage to 600 mV; "vc 2 1000 1000" will update point3 with the clock set to 1000 MHz and the voltage to 1000 mV.

    • For SMU13 ASICs, enter the new values by writing a string that contains "vc anchor_point_index voltage_offset" to the file. There are six anchor points in total defined on the v/f curve, indexed 0 to 5. E.g., "vc 0 10" will update the voltage offset for point1 to 10 mV; "vc 5 -10" will update the voltage offset for point6 to -10 mV.

  • To update the voltage offset applied to the gfxclk/voltage calculation, enter the new value by writing a string that contains "vo offset". This is supported by Sienna Cichlid, Navy Flounder and Dimgrey Cavefish. The offset can be a positive or negative value.

  • When you have edited all of the states as needed, write "c" (commit) to the file to commit your changes.

  • If you want to reset to the default power levels, write "r" (reset) to the file to reset them.
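The command strings for the newer interface can be assembled the same way. The helpers below are a sketch with hypothetical names. They only format the strings described above and do not check whether a given ASIC supports a particular command.

```python
def clock_limit_command(domain, index, clock_mhz):
    """"s/m index clock": index 0 sets the minimum, 1 the maximum."""
    if domain not in ("s", "m"):
        raise ValueError("domain must be 's' (sclk) or 'm' (mclk)")
    if index not in (0, 1):
        raise ValueError("index must be 0 (minimum) or 1 (maximum)")
    return f"{domain} {index} {clock_mhz}"

def core_clock_command(core, index, clock_mhz):
    """"p core index clock" (VanGogh core clocks)."""
    if index not in (0, 1):
        raise ValueError("index must be 0 (minimum) or 1 (maximum)")
    return f"p {core} {index} {clock_mhz}"

def curve_point_command(point, clock_mhz, voltage_mv):
    """"vc point clock voltage" (NV1X), with points 0, 1 and 2."""
    if point not in (0, 1, 2):
        raise ValueError("point must be 0, 1 or 2")
    return f"vc {point} {clock_mhz} {voltage_mv}"

def curve_offset_command(anchor, offset_mv):
    """"vc anchor_point_index voltage_offset" (SMU13), with anchors 0 to 5."""
    if not 0 <= anchor <= 5:
        raise ValueError("anchor must be between 0 and 5")
    return f"vc {anchor} {offset_mv}"

def voltage_offset_command(offset_mv):
    """"vo offset"; the offset may be positive or negative."""
    return f"vo {offset_mv}"
```

As with the older interface, the edits take effect only after "c" is written to the file.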

pp_dpm_*

The amdgpu driver provides a sysfs API for adjusting what power levels are enabled for a given power state. The files pp_dpm_sclk, pp_dpm_mclk, pp_dpm_socclk, pp_dpm_fclk, pp_dpm_dcefclk and pp_dpm_pcie are used for this.

The pp_dpm_socclk and pp_dpm_dcefclk interfaces are only available for Vega10 and later ASICs. The pp_dpm_fclk interface is only available for Vega20 and later ASICs.

Reading back the files will show you the available power levels within the power state and the clock information for those levels.

To manually adjust these states, first select manual using power_dpm_force_performance_level. Then enter a new value for each level by writing a string of the form "echo xx xx xx > pp_dpm_sclk/mclk/pcie". E.g.,

echo "4 5 6" > pp_dpm_sclk

will enable sclk levels 4, 5, and 6.

NOTE: changing the dcefclk max DPM level is not currently supported.
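When the set of levels is computed by a script, the string to write can be built as in the sketch below. The function name is hypothetical, and sorting and de-duplicating the levels is a choice made for this example rather than something the interface is documented to require.

```python
def format_dpm_levels(levels):
    """Build the space-separated level list written to a pp_dpm_* file."""
    unique = sorted(set(levels))
    if not unique:
        raise ValueError("at least one level must be enabled")
    if unique[0] < 0:
        raise ValueError("levels must be non-negative integers")
    return " ".join(str(level) for level in unique)
```

For example, format_dpm_levels([6, 4, 5]) produces "4 5 6", the string used in the example above.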

pp_power_profile_mode

The amdgpu driver provides a sysfs API for adjusting the heuristics related to switching between power levels in a power state. The file pp_power_profile_mode is used for this.

Reading this file outputs a list of all of the predefined power profiles and the relevant heuristics settings for that profile.

To select a profile or create a custom profile, first select manual using power_dpm_force_performance_level. Writing the number of a predefined profile to pp_power_profile_mode will enable those heuristics. To create a custom set of heuristics, write a string of numbers to the file starting with the number of the custom profile along with a setting for each heuristic parameter. Due to differences across ASIC families, the heuristic parameters vary from family to family.

*_busy_percent

The amdgpu driver provides a sysfs API for reading how busy the GPU is as a percentage. The file gpu_busy_percent is used for this. The SMU firmware computes a percentage of load based on the aggregate activity level in the IP cores.

The amdgpu driver provides a sysfs API for reading how busy the VRAM is as a percentage. The file mem_busy_percent is used for this. The SMU firmware computes a percentage of load based on the aggregate activity level in the IP cores.

gpu_metrics

The amdgpu driver provides a sysfs API for retrieving current GPU metrics data. The file gpu_metrics is used for this. Reading the file will dump all the current GPU metrics data.

These data include temperature, frequency, engine utilization, power consumption, throttler status, fan speed and CPU core statistics (available for APUs only). That is, it gives a snapshot of all sensors at the same time.
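The layout of the dump is not described here. As an assumption taken from the kernel's metrics table header definition rather than from this text, the data starts with a 16-bit structure size followed by 8-bit format and content revision numbers, and the rest of the layout depends on those revisions. A minimal sketch that decodes only this header, assuming little-endian byte order, might look like this:

```python
import struct

def parse_metrics_header(blob):
    """Decode the first four bytes of a gpu_metrics dump.

    Assumed layout (not stated in this document): uint16 structure_size,
    uint8 format_revision, uint8 content_revision, little-endian.
    """
    if len(blob) < 4:
        raise ValueError("gpu_metrics data is shorter than the header")
    size, format_rev, content_rev = struct.unpack_from("<HBB", blob, 0)
    return {
        "structure_size": size,
        "format_revision": format_rev,
        "content_revision": content_rev,
    }
```

The remaining fields should be decoded only after checking the revisions against the structure definitions for the running kernel.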

GFXOFF

GFXOFF is a feature found in most recent GPUs that saves power at runtime. The card's RLC (RunList Controller) firmware powers off the gfx engine dynamically when there is no workload on gfx or compute pipes. GFXOFF is on by default on supported GPUs.

Userspace can interact with GFXOFF through a debugfs interface (all values in uint32_t, unless otherwise noted):

amdgpu_gfxoff

Use it to enable/disable GFXOFF, and to check whether it is currently enabled or disabled:

$ xxd -l1 -p /sys/kernel/debug/dri/0/amdgpu_gfxoff
01

  • Write 0 to disable it, and 1 to enable it.

  • Read 0 means it's disabled, 1 it's enabled.

If it's enabled, the GPU is free to enter GFXOFF mode as needed. Disabled means that it will never enter GFXOFF mode.

amdgpu_gfxoff_status

Read it to check the current GFXOFF status of a GPU:

$ xxd -l1 -p /sys/kernel/debug/dri/0/amdgpu_gfxoff_status
02

  • 0: GPU is in GFXOFF state, the gfx engine is powered down.

  • 1: Transition out of GFXOFF state

  • 2: Not in GFXOFF state

  • 3: Transition into GFXOFF state

If GFXOFF is enabled, the value will be transitioning around [0, 3], always getting into 0 when possible. When it's disabled, it's always at 2. Returns -EINVAL if it's not supported.
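A program reading the file gets the raw uint32_t rather than text. The sketch below decodes such a value into one of the four states listed above. The function name is hypothetical, and little-endian byte order is an assumption of this example.

```python
import struct

# Status codes as listed above.
GFXOFF_STATES = {
    0: "in GFXOFF state (gfx engine powered down)",
    1: "transitioning out of GFXOFF state",
    2: "not in GFXOFF state",
    3: "transitioning into GFXOFF state",
}

def decode_gfxoff_status(raw):
    """Decode the bytes read from amdgpu_gfxoff_status (uint32, assumed little-endian)."""
    if len(raw) < 4:
        raise ValueError("expected at least 4 bytes")
    (value,) = struct.unpack_from("<I", raw, 0)
    return GFXOFF_STATES.get(value, f"unknown status {value}")
```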

amdgpu_gfxoff_count

Read it to get the total GFXOFF entry count at the time of query since system power-up. The value is a uint64_t; however, due to firmware limitations, it can currently overflow as a uint32_t. Only supported on VanGogh.
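Because the counter can wrap at 32 bits, the difference between two readings is best computed modulo 2^32. The following is a minimal sketch with a hypothetical function name. It assumes the counter wraps at most once between the two samples.

```python
UINT32_MODULUS = 1 << 32

def gfxoff_entries_between(earlier, later):
    """Return the number of GFXOFF entries between two counter readings,
    tolerating a single uint32 wraparound."""
    return (later - earlier) % UINT32_MODULUS
```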

amdgpu_gfxoff_residency

Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop. Read it to get the average GFXOFF residency % multiplied by 100 during the last logging interval. E.g. a value of 7854 means 78.54% of the time in the last logging interval the GPU was in GFXOFF mode. Only supported on VanGogh.
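Converting the raw value back to a percentage is a division by 100, as in this small sketch (the function name is chosen for this example):

```python
def gfxoff_residency_percent(raw_value):
    """Convert the value read from amdgpu_gfxoff_residency to a percentage."""
    return raw_value / 100
```

With the value from the example above, gfxoff_residency_percent(7854) gives 78.54.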
