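NVIDIA GPUs scale their clock frequency automatically by default. When you profile a program, the results reported by ncu can therefore be inaccurate, especially for small programs. For this reason, lock the GPU clock frequency before testing. The commands to run are: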
sudo nvidia-smi -pm 1
nvidia-smi -q -d CLOCK
sudo nvidia-smi -lgc 2100,2100
nvidia-smi -q -d CLOCK
-pm,  --persistence-mode=   Set persistence mode: 0/DISABLED, 1/ENABLED
-q,   --query               Display GPU or Unit info.
-d,   --display=            Display only selected information: MEMORY,
                                UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK,
                                COMPUTE, PIDS, PERFORMANCE, SUPPORTED_CLOCKS,
                                PAGE_RETIREMENT, ACCOUNTING, ENCODER_STATS,
                                SUPPORTED_GPU_TARGET_TEMP, VOLTAGE
                                FBC_STATS, ROW_REMAPPER
                            Flags can be combined with comma e.g. ECC,POWER.
                            Sampling data with max/min/avg is also returned
                            for POWER, UTILIZATION and CLOCK display types.
                            Doesn't work with -u or -x flags.
-lgc  --lock-gpu-clocks=    Specifies <minGpuClock,maxGpuClock> clocks as a
                                pair (e.g. 1500,1500) that defines the range
                                of desired locked GPU clock speed in MHz.
                                Setting this will supercede application clocks
                                and take effect regardless if an app is running.
                                Input can also be a singular desired clock value
                                (e.g. <GpuClockValue>).
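First set persistence mode to enabled, then use -q -d CLOCK to query the GPU's maximum SM clock frequency, and use -lgc to set the lower and upper bounds; both can be set directly to the maximum frequency. Finally, query the current SM clock frequency again. Note that the actual frequency reported by this last query may be lower than the target you set. For example, on the RTX 3090 I use, the query reports a maximum clock of 2100 MHz, but after locking to that value the actual clock is 1980 MHz.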
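The steps above could be wrapped in a few shell functions, roughly as follows. This is only a sketch: the function names, the GPU index argument, and the DRY_RUN switch are my own assumptions for illustration, and the --query-gpu fields (clocks.max.sm, clocks.sm) are used here in place of parsing the -q -d CLOCK output.

```shell
# Sketch: lock the GPU core clock and compare requested vs. actual clock.
# DRY_RUN=1 (hypothetical switch) prints commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Maximum SM clock in MHz reported by the driver for GPU index $1.
max_sm_clock() {
    nvidia-smi -i "$1" --query-gpu=clocks.max.sm --format=csv,noheader,nounits
}

# Current SM clock in MHz for GPU index $1.
current_sm_clock() {
    nvidia-smi -i "$1" --query-gpu=clocks.sm --format=csv,noheader,nounits
}

# Enable persistence mode, then lock GPU $1 to $2 MHz (min and max equal).
lock_gpu_clock() {
    run sudo nvidia-smi -i "$1" -pm 1
    run sudo nvidia-smi -i "$1" -lgc "$2,$2"
}

# Report whether the actual clock ($2) reached the requested clock ($1).
report_clock() {
    if [ "$2" -lt "$1" ]; then
        echo "requested $1 MHz, running at $2 MHz"
    else
        echo "locked at $2 MHz"
    fi
}
```

On a machine with a GPU, a typical sequence might be `target=$(max_sm_clock 0); lock_gpu_clock 0 "$target"; report_clock "$target" "$(current_sm_clock 0)"`. If the actual clock comes out lower than requested, as in the 2100 vs. 1980 MHz case above, the second value is the one the profile really ran at.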
When profiling a program with ncu, add --clock-control=none to stop ncu from controlling the GPU clocks. The ncu documentation describes the clock-control option as follows:
Control the behavior of the GPU clocks during profiling. Allowed values:
- base: GPC and memory clocks are locked to their respective base frequency during profiling. This has no impact on thermal throttling. Note that actual clocks might still vary, depending on the level of driver support for this feature. As an alternative, use nvidia-smi to lock the clocks externally and set this option to none.
- none: No GPC or memory frequencies are changed during profiling.
The default is base.
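As a rough illustration of how this fits together with externally locked clocks, the sketch below runs ncu with --clock-control none and then releases the lock. The function names, the report name, and the DRY_RUN switch are hypothetical; the reset step uses nvidia-smi -rgc (--reset-gpu-clocks), which is not mentioned above but undoes an earlier -lgc.

```shell
# Sketch: profile with clocks locked externally, then restore defaults.
# DRY_RUN=1 (hypothetical switch) prints commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = report name passed to ncu -o; remaining args = program and its arguments.
profile_with_fixed_clocks() {
    report="$1"
    shift
    # --clock-control none: ncu leaves GPC and memory clocks untouched.
    run ncu --clock-control none -o "$report" "$@"
}

# Undo an earlier `nvidia-smi -lgc` so the GPU returns to automatic scaling.
reset_gpu_clocks() {
    run sudo nvidia-smi -rgc
}
```

For example, `profile_with_fixed_clocks report ./my_app` followed by `reset_gpu_clocks`, where `./my_app` stands in for whatever binary is being profiled.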
Reference
Can I fix my GPU clock rate to ensure consistent profiling results?