Example: Intel Trace Analyzer on benchmark rank_league

  • Prepare environment
    # Remove all modules
    module purge
    
    # Load Intel Parallel Studio XE Cluster Edition environment
    source /opt/intel/parallel_studio_xe_2018/bin/psxevars.sh
    
  • Build rank_league benchmark with instrumentation
    export MPICC="mpiicc"
    export CFLAGS="-Ofast -xHost"
    
    # -tcollect
    #          inserts instrumentation probes calling the Intel(R) Trace Collector
    #          API.
    
    # -trace (short form: -t)
    #          links the resulting executable file against the Intel(R) Trace
    #          Collector library.
    
    # -g
    #          Produce symbolic debug information.
    
    CFLAGS+=" -tcollect -trace -g"
    
    ${MPICC} ${CFLAGS} rank_league.blocked.c -o rank_league
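
To judge how much the instrumentation itself costs, it can help to build an uninstrumented binary next to the instrumented one. The following is a minimal sketch, not part of the original build: the function name `build_flags` and the output name `rank_league.plain` are hypothetical, and the flag sets are the ones used above. It only prints the compile commands, so it also runs on a machine without the Intel compilers.

```shell
# Hypothetical helper: print the compiler flags for a plain or an
# instrumented build. The flag sets are copied from the build step above.
build_flags() {
    base="-Ofast -xHost"
    case "$1" in
        trace) echo "${base} -tcollect -trace -g" ;;
        plain) echo "${base}" ;;
        *)     echo "usage: build_flags trace|plain" >&2; return 1 ;;
    esac
}

# Print the compile commands instead of running them.
# rank_league.plain is an assumed name for the uninstrumented binary.
echo "mpiicc $(build_flags plain) rank_league.blocked.c -o rank_league.plain"
echo "mpiicc $(build_flags trace) rank_league.blocked.c -o rank_league"
```

Running both binaries with the same job script settings and comparing the reported bandwidth gives a rough idea of the tracing overhead.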
    
  • Create job script file rank_league.jobscript
    #MSUB -l nodes=4:ppn=1
    #MSUB -l walltime=00:20:00
    
    # Remove all modules
    module purge
    
    # Load Intel Parallel Studio XE Cluster Edition environment
    source /opt/intel/parallel_studio_xe_2018/bin/psxevars.sh
    
    # Configure trace collector
    export VT_LOGFILE_FORMAT=SINGLESTF      # Create only one trace file
    export VT_LOGFILE_NAME=rank_league.stf  # Name of trace file
    
    # Configure mpirun
    export I_MPI_HYDRA_BOOTSTRAP="pbsdsh"
    export I_MPI_HYDRA_BRANCH_COUNT="-1"
    
    # Execute rank_league benchmark
    #   test_type:   b - bandwidth (default)
    #   output_type: s - statistics per rank - average, min, max
    #   loop_num:    number of loops per round (default is 200)
    mpirun \
        -print-rank-map -binding domain=core \
        ./rank_league -t=b -o=s -l=200
    
  • Execute instrumented rank_league benchmark and create trace file rank_league.stf
    msub < rank_league.jobscript
    
    Intel(R) Parallel Studio XE 2018 Update 1 for Linux*
    Copyright (C) 2009-2017 Intel Corporation. All rights reserved.
    
    (fhcn0004:0)
    (fhcn0003:1)
    (fhcn0002:2)
    (fhcn0001:3)
    
    ****** Running bandwidth test ********
    Total number of rounds:          3
    Total number of loops per round: 200
    Message size:                    100000
    **************************************
    Round number      1... 2... 3...
    **************************************
    RANK           MIN                MAX               AVERAGE
               RESULT  RANK        RESULT  RANK
    ___________________________________________________________
     0        6112.36   3          6127.54   2          6121.29
     1        6101.25   2          6127.54   0          6111.79
     2        6103.03   3          6148.20   0          6131.19
     3        6107.02   1          6129.34   2          6118.77
    ___________________________________________________________
    Global statistics:
     MIN     6101.25 between 1 and 2
     MAX     6148.20 between 2 and 0
     AVERAGE 6120.76
    
    [0] Intel(R) Trace Collector INFO: Writing tracefile rank_league.stf in ...
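
When several runs are compared, the global average can be pulled out of the saved job output with a few lines of awk. This is a sketch under assumptions: the function name `global_average` is hypothetical, the output is assumed to be saved in a file whose name depends on the batch system configuration, and the "Global statistics:" layout is assumed to match the listing above.

```shell
# Hypothetical helper: print the global AVERAGE value from a file that
# contains the benchmark output shown above.
global_average() {
    awk '/Global statistics:/           { in_stats = 1; next }
         in_stats && $1 == "AVERAGE"    { print $2; exit }' "$1"
}

# Usage with a small sample that copies the "Global statistics" lines
# from the output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Global statistics:
 MIN     6101.25 between 1 and 2
 MAX     6148.20 between 2 and 0
 AVERAGE 6120.76
EOF
global_average "$sample"   # prints 6120.76
rm -f "$sample"
```

The `in_stats` guard keeps the table header, which also contains the word AVERAGE, from being matched.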
    
  • Analyze trace file
    traceanalyzer rank_league.stf
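
The command above opens the Trace Analyzer graphical interface, so it needs both the trace file and a working X display. The check below is a minimal sketch with a hypothetical function name `trace_ready`; it assumes the GUI is started from a login session and that X forwarding (for example `ssh -X`) is how the display is provided.

```shell
# Hypothetical pre-flight check before starting the Trace Analyzer GUI.
# Returns 0 if the trace file is non-empty and DISPLAY is set,
# 1 if the trace file is missing or empty, 2 if DISPLAY is not set.
trace_ready() {
    trace_file="$1"
    if [ ! -s "$trace_file" ]; then
        echo "trace file missing or empty: $trace_file" >&2
        return 1
    fi
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; log in with X forwarding" >&2
        return 2
    fi
    return 0
}

# Usage: only report readiness when both checks pass.
if trace_ready rank_league.stf; then
    echo "ready: traceanalyzer rank_league.stf"
fi
```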
    
Last modified on Apr 10, 2018, 2:00:35 PM