Profiling Python Programs with GVProf

This walkthrough uses alexnet from pytorch/benchmark as the example. The command for a normal run is python3 run.py alexnet -d cuda -t eval

hpcrun -e gpu=nvidia python3 run.py alexnet -d cuda -t eval
rm hpctoolkit-python3-measurements/*hpcrun
hpcrun -e gpu=nvidia,data_flow -o hpctoolkit-python3-measurements/ -ck HPCRUN_SANITIZER_READ_TRACE_IGNORE=1 -ck HPCRUN_SANITIZER_GPU_PATCH_RECORD_NUM=531072 -ck HPCRUN_SANITIZER_GPU_ANALYSIS_BLOCKS=1 python3 run.py alexnet -d cuda -t eval
hpcprof hpctoolkit-python3-measurements/

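After a few minutes, hpcprof will hang. This happens because hpctoolkit cannot find some of the libraries it is trying to parse. Attach to hpcprof-bin with gdb -p $(pidof hpcprof-bin) to debug it.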
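As a rough sketch, the backtraces of all hpcprof-bin threads can also be dumped without an interactive session, which makes it easier to search for the stuck frame. The function name dump_hpcprof_backtraces is made up for this example, and the sketch assumes exactly one hpcprof-bin process is running and that the current user is allowed to attach to it.

```shell
#!/bin/sh
# Hypothetical helper: print the backtrace of every thread in hpcprof-bin.
# Assumes a single hpcprof-bin process and permission to attach with gdb.
dump_hpcprof_backtraces() {
    # pidof fails when the process is not running.
    pid=$(pidof hpcprof-bin) || {
        echo "hpcprof-bin is not running" >&2
        return 1
    }
    # -batch makes gdb exit after running the -ex command.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}
```

Running dump_hpcprof_backtraces and searching the output for Seg.cpp shows which thread is stuck.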

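In the stuck thread, run bt and look for the frame at Seg.cpp:166 (in BinUtil::TextSeg::TextSeg). Switch to that frame with f number, then run p *m_lm and read m_name, which is the path of the library file currently being parsed. Parse that library with hpcstruct lib_path (adding -j 4 parses it with 4 threads, at the cost of losing a small amount of information). This produces a corresponding hpcstruct file. Then rerun hpcprof -S lib.hpcstruct hpctoolkit-python3-measurements/. If hpcprof still hangs, repeat the steps above and add each further library that has to be parsed manually. For PyTorch programs, it is usually enough to parse libtorch_cpu.so, libtorch_cuda.so, and libtorch_python.so manually.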
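The parse-and-retry steps above can be sketched as a small script. The function names build_hpcprof_cmd and parse_and_profile are made up for this example. It assumes hpcstruct writes <basename>.hpcstruct into the current directory, that hpcprof accepts one -S flag per structure file, and that no path contains spaces.

```shell
#!/bin/sh
# Hypothetical helpers: pre-parse the libraries that stall hpcprof, then
# pass every resulting structure file to hpcprof with its own -S flag.
# Assumption: hpcstruct writes <basename>.hpcstruct into the current directory.

# Print the hpcprof command line for a measurements directory ($1)
# and any number of library paths ($2 onwards).
build_hpcprof_cmd() {
    meas_dir=$1
    shift
    cmd="hpcprof"
    for lib in "$@"; do
        cmd="$cmd -S $(basename "$lib").hpcstruct"
    done
    printf '%s %s\n' "$cmd" "$meas_dir"
}

# Run hpcstruct on each library, then run the assembled hpcprof command.
parse_and_profile() {
    meas_dir=$1
    shift
    for lib in "$@"; do
        hpcstruct "$lib" || return 1
    done
    # Word splitting is intentional; paths with spaces are not handled.
    $(build_hpcprof_cmd "$meas_dir" "$@")
}
```

For example, parse_and_profile hpctoolkit-python3-measurements/ followed by the m_name paths reported by gdb for libtorch_cpu.so, libtorch_cuda.so, and libtorch_python.so.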

gviewer -v -f ./data_flow.dot.context -cf file -pr

This generates the final value flow graph.
