Example likwid-perfctr performance group MEM on benchmark stream

  • Build stream benchmark with GNU compiler
    module purge
    module add compiler/gnu/7
    gcc -std=c11 -Ofast -march=native -flto -fopenmp \
         stream.c -o stream
    
  • Set up OpenMP environment
    export OMP_NUM_THREADS=20
    
  • List available performance groups
    likwid-perfctr -a
    
    ...
            MEM     Main memory bandwidth in MBytes/s
         MEM_DP     Overview of arithmetic and main memory performance
         MEM_SP     Overview of arithmetic and main memory performance
           NUMA     Local and remote data transfers
    ...
    
  • Get detailed information on performance groups
    likwid-perfctr -H --group MEM
    
    Group MEM:
    Formulas:
    Memory read bandwidth [MBytes/s] = 1.0E-06*(SUM(MBOXxC0))*64.0/time
    Memory read data volume [GBytes] = 1.0E-09*(SUM(MBOXxC0))*64.0
    Memory write bandwidth [MBytes/s] = 1.0E-06*(SUM(MBOXxC1))*64.0/time
    Memory write data volume [GBytes] = 1.0E-09*(SUM(MBOXxC1))*64.0
    Memory bandwidth [MBytes/s] = 1.0E-06*(SUM(MBOXxC0)+SUM(MBOXxC1))*64.0/time
    Memory data volume [GBytes] = 1.0E-09*(SUM(MBOXxC0)+SUM(MBOXxC1))*64.0
    -
    Profiling group to measure memory bandwidth drawn by all cores of a socket.
    Since this group is based on Uncore events it is only possible to measure on a
    per socket base. Also outputs total data volume transferred from main memory.
    
  • Measure performance group MEM for benchmark stream on CPUs 0 to 19
    likwid-perfctr \
        --group MEM \
        -C 0-19  \
        ./stream -n 100000000
    
    --------------------------------------------------------------------------------
    CPU name:       Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
    CPU type:       Intel Xeon Haswell EN/EP/EX processor
    CPU clock:      2.30 GHz
    --------------------------------------------------------------------------------
    
    -------------------------------------------------------------
    STREAM version $Revision: 5.10 $
    -------------------------------------------------------------
    This system uses 8 bytes per array element.
    -------------------------------------------------------------
    Array size = 100000000 (elements) (elements)
    Memory per array = 762.9 MiB (= 0.7 GiB).
    Total memory required = 2288.8 MiB (= 2.2 GiB).
    Each kernel will be executed 10 times.
     The *best* time for each kernel (excluding the first iteration)
     will be used to compute the reported bandwidth.
    -------------------------------------------------------------
    Number of Threads requested = 20
    Number of Threads counted = 20
    -------------------------------------------------------------
    Your clock granularity/precision appears to be 1 microseconds.
    Each test below will take on the order of 15955 microseconds.
       (= 15955 clock ticks)
    Increase the size of the arrays if this shows that
    you are not getting at least 20 clock ticks per test.
    -------------------------------------------------------------
    WARNING -- The above is only a rough guideline.
    For best results, please be sure you know the
    precision of your system timer.
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          102387.3     0.016643     0.015627     0.020727
    Scale:          72944.7     0.022326     0.021934     0.024904
    Add:            81663.2     0.029859     0.029389     0.032404
    Triad:          81578.8     0.029487     0.029419     0.029520
    -------------------------------------------------------------
    Solution Validates: avg error less than 1.000000e-13 on all three arrays
    -------------------------------------------------------------
    
    --------------------------------------------------------------------------------
    Group 1: MEM
    +-----------------------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
    |         Event         | Counter |   Core 0   |   Core 1   |   Core 2   |   Core 3   |   Core 4   |   Core 5   |   Core 6   |   Core 7   |   Core 8   |   Core 9   |   Core 10  |   Core 11  |   Core 12  |   Core 13  |   Core 14  |   Core 15  |   Core 16  |   Core 17  |   Core 18  |   Core 19  |
    +-----------------------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
    |   INSTR_RETIRED_ANY   |  FIXC0  |  938295039 |  860131666 |  844442515 |  862490745 |  855282747 |  863090338 |  869367586 |  862849791 |  851971039 |  866514428 |  851549344 |  851361100 |  844113575 |  864512780 |  851730705 |  859556446 |  843886084 |  860724500 |  872205627 |  872122155 |
    | CPU_CLK_UNHALTED_CORE |  FIXC1  | 2618345331 | 2536800836 | 2551076323 | 2534122499 | 2547936397 | 2548491438 | 2539820912 | 2532567385 | 2542849578 | 2546347750 | 2551713389 | 2551333284 | 2537147110 | 2529932248 | 2551593906 | 2542084452 | 2541590035 | 2546862115 | 2644565960 | 2550826845 |
    |  CPU_CLK_UNHALTED_REF |  FIXC2  | 2308912178 | 2244121316 | 2256404949 | 2240746434 | 2253571004 | 2254085192 | 2246398592 | 2239977222 | 2248706734 | 2251983751 | 2257285021 | 2256941125 | 2243232274 | 2237576712 | 2256949681 | 2247969906 | 2246809234 | 2252129709 | 2334964324 | 2255982393 |
    |      CAS_COUNT_RD     | MBOX0C0 |  149185857 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |  147906384 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX0C1 |   69185518 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |   69109259 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX1C0 |  152486380 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |  147870827 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX1C1 |   69626367 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |   69075586 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX2C0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX2C1 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX3C0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX3C1 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX4C0 |  149851128 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |  147883074 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX4C1 |   69262252 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |   69133632 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX5C0 |  149850079 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |  147845659 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX5C1 |   69420339 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |   69101844 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX6C0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX6C1 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_RD     | MBOX7C0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    |      CAS_COUNT_WR     | MBOX7C1 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
    +-----------------------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
    
    +----------------------------+---------+-------------+------------+------------+--------------+
    |            Event           | Counter |     Sum     |     Min    |     Max    |      Avg     |
    +----------------------------+---------+-------------+------------+------------+--------------+
    |   INSTR_RETIRED_ANY STAT   |  FIXC0  | 17246198210 |  843886084 |  938295039 | 8.623099e+08 |
    | CPU_CLK_UNHALTED_CORE STAT |  FIXC1  | 51046007793 | 2529932248 | 2644565960 | 2.552300e+09 |
    |  CPU_CLK_UNHALTED_REF STAT |  FIXC2  | 45134747751 | 2237576712 | 2334964324 | 2.256737e+09 |
    |      CAS_COUNT_RD STAT     | MBOX0C0 |   297092241 |          0 |  149185857 | 1.485461e+07 |
    |      CAS_COUNT_WR STAT     | MBOX0C1 |   138294777 |          0 |   69185518 | 6.914739e+06 |
    |      CAS_COUNT_RD STAT     | MBOX1C0 |   300357207 |          0 |  152486380 | 1.501786e+07 |
    |      CAS_COUNT_WR STAT     | MBOX1C1 |   138701953 |          0 |   69626367 | 6.935098e+06 |
    |      CAS_COUNT_RD STAT     | MBOX2C0 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_WR STAT     | MBOX2C1 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_RD STAT     | MBOX3C0 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_WR STAT     | MBOX3C1 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_RD STAT     | MBOX4C0 |   297734202 |          0 |  149851128 | 1.488671e+07 |
    |      CAS_COUNT_WR STAT     | MBOX4C1 |   138395884 |          0 |   69262252 | 6.919794e+06 |
    |      CAS_COUNT_RD STAT     | MBOX5C0 |   297695738 |          0 |  149850079 | 1.488479e+07 |
    |      CAS_COUNT_WR STAT     | MBOX5C1 |   138522183 |          0 |   69420339 | 6.926109e+06 |
    |      CAS_COUNT_RD STAT     | MBOX6C0 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_WR STAT     | MBOX6C1 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_RD STAT     | MBOX7C0 |           0 |          0 |          0 |            0 |
    |      CAS_COUNT_WR STAT     | MBOX7C1 |           0 |          0 |          0 |            0 |
    +----------------------------+---------+-------------+------------+------------+--------------+
    
    +-----------------------------------+------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
    |               Metric              |   Core 0   |   Core 1  |   Core 2  |   Core 3  |   Core 4  |   Core 5  |   Core 6  |   Core 7  |   Core 8  |   Core 9  |   Core 10  |  Core 11  |  Core 12  |  Core 13  |  Core 14  |  Core 15  |  Core 16  |  Core 17  |  Core 18  |  Core 19  |
    +-----------------------------------+------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
    |        Runtime (RDTSC) [s]        |     1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |     1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |    1.4661 |
    |        Runtime unhalted [s]       |     1.1384 |    1.1030 |    1.1092 |    1.1018 |    1.1078 |    1.1081 |    1.1043 |    1.1011 |    1.1056 |    1.1071 |     1.1095 |    1.1093 |    1.1031 |    1.1000 |    1.1094 |    1.1053 |    1.1051 |    1.1073 |    1.1498 |    1.1091 |
    |            Clock [MHz]            |  2608.1950 | 2599.9236 | 2600.3210 | 2601.0904 | 2600.3864 | 2600.3596 | 2600.3801 | 2600.3868 | 2600.8087 | 2600.5967 |  2599.9563 | 2599.9651 | 2601.3091 | 2600.4680 | 2600.2208 | 2600.8783 | 2601.7158 | 2600.9535 | 2604.9219 | 2600.5537 |
    |                CPI                |     2.7905 |    2.9493 |    3.0210 |    2.9381 |    2.9791 |    2.9528 |    2.9215 |    2.9351 |    2.9847 |    2.9386 |     2.9966 |    2.9968 |    3.0057 |    2.9264 |    2.9958 |    2.9574 |    3.0118 |    2.9590 |    3.0320 |    2.9249 |
    |  Memory read bandwidth [MBytes/s] | 26251.9756 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 | 25821.2260 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
    |  Memory read data volume [GBytes] |    38.4879 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |    37.8564 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
    | Memory write bandwidth [MBytes/s] | 12113.5682 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 | 12066.6777 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
    | Memory write data volume [GBytes] |    17.7596 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |    17.6909 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
    |    Memory bandwidth [MBytes/s]    | 38365.5438 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 | 37887.9037 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
    |    Memory data volume [GBytes]    |    56.2475 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |    55.5473 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
    +-----------------------------------+------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
    
    +----------------------------------------+------------+-----------+------------+-----------+
    |                 Metric                 |     Sum    |    Min    |     Max    |    Avg    |
    +----------------------------------------+------------+-----------+------------+-----------+
    |        Runtime (RDTSC) [s] STAT        |    29.3220 |    1.4661 |     1.4661 |    1.4661 |
    |        Runtime unhalted [s] STAT       |    22.1943 |    1.1000 |     1.1498 |    1.1097 |
    |            Clock [MHz] STAT            | 52023.3908 | 2599.9236 |  2608.1950 | 2601.1695 |
    |                CPI STAT                |    59.2171 |    2.7905 |     3.0320 |    2.9609 |
    |  Memory read bandwidth [MBytes/s] STAT | 52073.2016 |         0 | 26251.9756 | 2603.6601 |
    |  Memory read data volume [GBytes] STAT |    76.3443 |         0 |    38.4879 |    3.8172 |
    | Memory write bandwidth [MBytes/s] STAT | 24180.2459 |         0 | 12113.5682 | 1209.0123 |
    | Memory write data volume [GBytes] STAT |    35.4505 |         0 |    17.7596 |    1.7725 |
    |    Memory bandwidth [MBytes/s] STAT    | 76253.4475 |         0 | 38365.5438 | 3812.6724 |
    |    Memory data volume [GBytes] STAT    |   111.7948 |         0 |    56.2475 |    5.5897 |
    +----------------------------------------+------------+-----------+------------+-----------+
    
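    • The memory counters (MBOXxCy) are Uncore counters and count per socket, so they are only reported on the first measured CPU core of each socket (here core 0 and core 10); all other cores show 0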
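
The socket 0 metrics can be reproduced by applying the formulas printed by `likwid-perfctr -H --group MEM` to the raw CAS counts of core 0. The following Python sketch is not part of LIKWID. The function and variable names are illustrative, and the runtime is the rounded value from the metric table, so the bandwidths agree with the table only to within rounding.

```python
# Illustrative sketch, not part of LIKWID: recompute the socket 0 MEM metrics
# from the raw counter values shown in the event table.
CACHE_LINE_BYTES = 64.0  # factor 64.0 from the group formulas

# CAS counts reported on core 0 (socket 0) for the four non-zero MBOX units
cas_rd = [149185857, 152486380, 149851128, 149850079]  # MBOX0C0, MBOX1C0, MBOX4C0, MBOX5C0
cas_wr = [69185518, 69626367, 69262252, 69420339]      # MBOX0C1, MBOX1C1, MBOX4C1, MBOX5C1
runtime_s = 1.4661  # Runtime (RDTSC) [s], rounded as printed


def data_volume_gbytes(counts):
    """Data volume [GBytes] = 1.0E-09 * SUM(counts) * 64.0"""
    return 1.0e-9 * sum(counts) * CACHE_LINE_BYTES


def bandwidth_mbytes_s(counts, time_s):
    """Bandwidth [MBytes/s] = 1.0E-06 * SUM(counts) * 64.0 / time"""
    return 1.0e-6 * sum(counts) * CACHE_LINE_BYTES / time_s


read_gb = data_volume_gbytes(cas_rd)
write_gb = data_volume_gbytes(cas_wr)
read_bw = bandwidth_mbytes_s(cas_rd, runtime_s)
write_bw = bandwidth_mbytes_s(cas_wr, runtime_s)
total_bw = bandwidth_mbytes_s(cas_rd + cas_wr, runtime_s)

print(f"Memory read data volume  [GBytes]: {read_gb:.4f}")
print(f"Memory write data volume [GBytes]: {write_gb:.4f}")
print(f"Memory read bandwidth  [MBytes/s]: {read_bw:.1f}")
print(f"Memory write bandwidth [MBytes/s]: {write_bw:.1f}")
print(f"Memory bandwidth       [MBytes/s]: {total_bw:.1f}")
```

The two data volumes come out as 38.4879 and 17.7596 GBytes, as in the metric table. The bandwidths differ from the table in the last digits because the runtime used here is rounded to four decimals.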
  • Validity check
    Socket 0:
    Memory  read bandwidth: 26251.9756 MBytes/s
    Memory write bandwidth: 12113.5682 MBytes/s
    +                       ----------
                            38365.5438 MBytes/s
    Memory bandwidth:       38365.5438 MBytes/s
    
    Memory write data volume socket 0:      17.7596 GB
    Memory write data volume socket 1:      17.6909 GB
    +                                       ----------
                                            35.4505 GB
    Memory write data volume [GBytes] STAT: 35.4505 GB
    
    Memory read data volume socket 0:      38.4879 GB
    Memory read data volume socket 1:      37.8564 GB
    +                                      ----------
                                           76.3443 GB
    Memory read data volume [GBytes] STAT: 76.3443 GB
    
    #Elements/vec   = 100,000,000
    #Bytes/Element  = 8
    #Bytes/vec      = 100,000,000 * 8 = 800,000,000
    #Repetitions    = 10
    
    Copy:  1 Vec. read, 1 Vec. write
    Scale: 1 Vec. read, 1 Vec. write
    Add:   2 Vec. read, 1 Vec. write
    Triad: 2 Vec. read, 1 Vec. write
    
    4 Vec. write * 10 repetitions * 800,000,000 Bytes/vec = 32 GB
    ~ 35.4505 Memory write data volume [GBytes] STAT
    
    6 Vec. read * 10 repetitions * 800,000,000 Bytes/vec = 48 GB
    !~ 76.3443 Memory read data volume [GBytes] STAT
    
    (6 Vec. read + 4 Vec. write) * 10 repetitions * 800,000,000 Bytes/vec = 80 GB
    ~ 76.3443 Memory read data volume [GBytes] STAT
    
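    • Each store to memory triggers an extra read from memory (write-allocate: the cache line is loaded before it is overwritten). => The GNU compiler does not use non-temporal stores here, which could write directly to memory without that extra read.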
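
The expected data volumes in the validity check can be written out as a small calculation. This is a minimal Python sketch with illustrative names. It models only the 10 repetitions of the four kernels and assumes that every written vector is also read once (write-allocate). The measured values cover the whole program run, so only rough agreement is expected.

```python
# Illustrative sketch: expected STREAM memory traffic versus the values
# measured with likwid-perfctr (group MEM).
ELEMENTS_PER_VEC = 100_000_000
BYTES_PER_ELEMENT = 8
REPETITIONS = 10

# (vectors read, vectors written) per kernel, as listed in the validity check
kernels = {"Copy": (1, 1), "Scale": (1, 1), "Add": (2, 1), "Triad": (2, 1)}

bytes_per_vec = ELEMENTS_PER_VEC * BYTES_PER_ELEMENT
vec_reads = sum(r for r, _ in kernels.values())   # 6
vec_writes = sum(w for _, w in kernels.values())  # 4

expected_write_gb = vec_writes * REPETITIONS * bytes_per_vec / 1e9
expected_read_gb = vec_reads * REPETITIONS * bytes_per_vec / 1e9
# Assumption: write-allocate, every written cache line is read from memory first
expected_read_wa_gb = (vec_reads + vec_writes) * REPETITIONS * bytes_per_vec / 1e9

measured_write_gb = 35.4505  # Memory write data volume [GBytes] STAT
measured_read_gb = 76.3443   # Memory read data volume [GBytes] STAT

print(f"write: expected {expected_write_gb:.0f} GB, measured {measured_write_gb} GB")
print(f"read : expected {expected_read_gb:.0f} GB without write-allocate, "
      f"{expected_read_wa_gb:.0f} GB with write-allocate, "
      f"measured {measured_read_gb} GB")
```

The measured read volume is far from the 48 GB expected without write-allocate and close to the 80 GB expected with it, which is what the conclusion about missing non-temporal stores rests on.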

Example likwid-perfctr performance group NUMA on benchmark stream

  • Build stream benchmark with Intel compiler
    module purge
    module add compiler/intel/18.0
    icc -std=c11 -Ofast -xHost -ipo -qopenmp \
         stream.c -o stream
    
  • Set up OpenMP environment
    export OMP_NUM_THREADS=20
    
  • List available performance groups
    likwid-perfctr -a
    
    ...
            MEM     Main memory bandwidth in MBytes/s
         MEM_DP     Overview of arithmetic and main memory performance
         MEM_SP     Overview of arithmetic and main memory performance
           NUMA     Local and remote data transfers
    ...
    
  • Get detailed information on performance groups
    likwid-perfctr -H --group NUMA
    
    Group NUMA:
    Formula:
    CPI = CPU_CLK_UNHALTED_CORE/INSTR_RETIRED_ANY
    Local DRAM data volume [GByte] = 1.E-09*OFFCORE_RESPONSE_0_LOCAL_DRAM*64
    Local DRAM bandwidth [MByte/s] = 1.E-06*(OFFCORE_RESPONSE_0_LOCAL_DRAM*64)/time
    Remote DRAM data volume [GByte] = 1.E-09*OFFCORE_RESPONSE_1_REMOTE_DRAM*64
    Remote DRAM bandwidth [MByte/s] = 1.E-06*(OFFCORE_RESPONSE_1_REMOTE_DRAM*64)/time
    Memory data volume [GByte] = 1.E-09*(OFFCORE_RESPONSE_0_LOCAL_DRAM+OFFCORE_RESPONSE_1_REMOTE_DRAM)*64
    Memory bandwidth [MByte/s] = 1.E-06*((OFFCORE_RESPONSE_0_LOCAL_DRAM+OFFCORE_RESPONSE_1_REMOTE_DRAM)*64)/time
    --
    This performance group measures the data traffic of CPU cores to local and remote
    memory.
    
  • Measure performance group NUMA for benchmark stream on CPUs 0 to 19 with locally allocated memory
    likwid-perfctr --group NUMA -C 0-19 \
        numactl --localalloc \
           ./stream -n 100000000
    
    ...
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          104573.5     0.015537     0.015300     0.015842
    Scale:         105859.6     0.015214     0.015114     0.015308
    Add:           108120.1     0.022280     0.022198     0.022395
    Triad:         109300.7     0.021987     0.021958     0.022040
    -------------------------------------------------------------
    
    ...
    +--------------------------------------+------------+--------------+-----------+-----------+
    |                Metric                |     Sum    |      Min     |    Max    |    Avg    |
    +--------------------------------------+------------+--------------+-----------+-----------+
    |       Runtime (RDTSC) [s] STAT       |    37.9020 |       1.8951 |    1.8951 |    1.8951 |
    |       Runtime unhalted [s] STAT      |    18.2252 |       0.8769 |    1.3908 |    0.9113 |
    |           Clock [MHz] STAT           | 58088.7700 |    2899.9413 | 2989.1844 | 2904.4385 |
    |               CPI STAT               |   128.2410 |       1.0635 |    7.0634 |    6.4120 |
    |  Local DRAM data volume [GByte] STAT |    13.3097 |       0.6477 |    0.6756 |    0.6655 |
    |  Local DRAM bandwidth [MByte/s] STAT |  7023.3007 |     341.7686 |  356.5150 |  351.1650 |
    | Remote DRAM data volume [GByte] STAT |     0.0063 | 2.496000e-05 |    0.0008 |    0.0003 |
    | Remote DRAM bandwidth [MByte/s] STAT |     3.2454 |       0.0132 |    0.4004 |    0.1623 |
    |    Memory data volume [GByte] STAT   |    13.3158 |       0.6481 |    0.6758 |    0.6658 |
    |    Memory bandwidth [MByte/s] STAT   |  7026.5460 |     342.0066 |  356.5833 |  351.3273 |
    +--------------------------------------+------------+--------------+-----------+-----------+
    
    • Remote DRAM data volume and Remote DRAM bandwidth are very low (0.0063 of 13.3158 GByte in total)
  • Measure performance group NUMA for benchmark stream on CPUs 0 to 19 with all memory allocated in NUMA domain 0
    likwid-perfctr --group NUMA -C 0-19 \
        numactl --membind=0 \
           ./stream -n 100000000
    
    ...
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:           50143.6     0.031936     0.031908     0.031993
    Scale:          49960.4     0.032053     0.032025     0.032086
    Add:            56319.0     0.042653     0.042614     0.042680
    Triad:          56425.9     0.042577     0.042534     0.042612
    -------------------------------------------------------------
    
    ...
    +--------------------------------------+------------+--------------+-----------+-----------+
    |                Metric                |     Sum    |      Min     |    Max    |    Avg    |
    +--------------------------------------+------------+--------------+-----------+-----------+
    |       Runtime (RDTSC) [s] STAT       |    53.4480 |       2.6724 |    2.6724 |    2.6724 |
    |       Runtime unhalted [s] STAT      |    34.9158 |       1.6857 |    2.1850 |    1.7458 |
    |           Clock [MHz] STAT           | 58063.6648 |    2899.9915 | 2963.3627 | 2903.1832 |
    |               CPI STAT               |   167.8990 |       1.3344 |   14.4638 |    8.3950 |
    |  Local DRAM data volume [GByte] STAT |     6.5933 | 7.744000e-06 |    0.6628 |    0.3297 |
    |  Local DRAM bandwidth [MByte/s] STAT |  2467.1862 |       0.0029 |  248.0175 |  123.3593 |
    | Remote DRAM data volume [GByte] STAT |     6.6188 |            0 |    0.6689 |    0.3309 |
    | Remote DRAM bandwidth [MByte/s] STAT |  2476.7374 |            0 |  250.3028 |  123.8369 |
    |    Memory data volume [GByte] STAT   |    13.2118 |       0.6343 |    0.6689 |    0.6606 |
    |    Memory bandwidth [MByte/s] STAT   |  4943.9239 |     237.3654 |  250.3130 |  247.1962 |
    +--------------------------------------+------------+--------------+-----------+-----------+
    
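    • Remote DRAM data volume and Remote DRAM bandwidth are very high: about half of the data volume (6.6188 of 13.2118 GByte) comes from remote DRAM
    • Memory bandwidth reported by stream roughly halved (e.g. Copy: 104573.5 MB/s with --localalloc, 50143.6 MB/s with --membind=0)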
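
The comparison of the two NUMA runs can be quantified from the numbers printed above. The following Python sketch uses illustrative names and only values copied from the two outputs. It computes the share of remote DRAM traffic and the ratio of the stream best rates between the runs.

```python
# Illustrative sketch: compare the --localalloc and --membind=0 runs.
runs = {
    "localalloc": {
        "rates_mb_s": {"Copy": 104573.5, "Scale": 105859.6,
                       "Add": 108120.1, "Triad": 109300.7},
        "local_gb": 13.3097,   # Local DRAM data volume [GByte] STAT, Sum
        "remote_gb": 0.0063,   # Remote DRAM data volume [GByte] STAT, Sum
    },
    "membind=0": {
        "rates_mb_s": {"Copy": 50143.6, "Scale": 49960.4,
                       "Add": 56319.0, "Triad": 56425.9},
        "local_gb": 6.5933,
        "remote_gb": 6.6188,
    },
}


def remote_share(run):
    """Fraction of the measured DRAM data volume that came from remote memory."""
    return run["remote_gb"] / (run["local_gb"] + run["remote_gb"])


# Ratio of the stream best rate (membind=0 relative to localalloc) per kernel
rate_ratio = {
    kernel: runs["membind=0"]["rates_mb_s"][kernel]
    / runs["localalloc"]["rates_mb_s"][kernel]
    for kernel in runs["localalloc"]["rates_mb_s"]
}

for name, run in runs.items():
    print(f"{name:>10}: remote share = {remote_share(run):.4f}")
for kernel, ratio in rate_ratio.items():
    print(f"{kernel:>10}: rate ratio   = {ratio:.2f}")
```

With all memory bound to NUMA domain 0, the remote share is close to one half, and each kernel reaches roughly half of the rate of the locally allocated run.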
Last modified on Apr 5, 2019, 10:17:06 AM