Microsoft Diskspd Utility Quick Start

I recently came across the Microsoft Diskspd Utility, a quick and uncomplicated way to run a disk performance test.

I do not want to kick off the tiresome benchmarking debate here; the Diskspd Utility is only about running a simple test or generating load.

Diskspd Utility – Simple Invocation

Set the block size to 8K, run the test for 60 seconds, disable all hardware and software caching, measure and display latency statistics, use 2 overlapped IOs and 4 threads per target, issue random IO with 30% writes and 70% reads, and create a 50MB test file at c:\io.dat:

.\amd64fre\Diskspd.exe -b8K -d60 -h -L -o2 -t4 -r -w30 -c50M c:\io.dat

Figure: Microsoft Diskspd Utility - simple IO result
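To make the mapping between the description and the flags easier to follow, here is a minimal Python sketch that assembles the same command line from named options. The function `build_diskspd_args` and its parameter names are my own, hypothetical helpers and not part of DiskSpd; the sketch only builds the argument list and does not start the tool.

```python
def build_diskspd_args(target, block_size="8K", duration_s=60, outstanding=2,
                       threads=4, write_pct=30, file_size="50M",
                       random_io=True, disable_caching=True, latency=True):
    """Return the DiskSpd argument list for the options described above.

    Hypothetical helper: the parameter names are illustrative only.
    """
    args = [f"-b{block_size}", f"-d{duration_s}"]
    if disable_caching:
        args.append("-h")            # disable hardware and software caching
    if latency:
        args.append("-L")            # measure latency statistics
    args += [f"-o{outstanding}", f"-t{threads}"]
    if random_io:
        args.append("-r")            # random instead of sequential IO
    args += [f"-w{write_pct}", f"-c{file_size}", target]
    return args


cmd = [r".\amd64fre\Diskspd.exe"] + build_diskspd_args(r"c:\io.dat")
print(" ".join(cmd))
# .\amd64fre\Diskspd.exe -b8K -d60 -h -L -o2 -t4 -r -w30 -c50M c:\io.dat
```

Keeping the arguments in a list makes it easy to vary a single value, for example the write percentage, between runs.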

Diskspd Utility – Processed Results

Script Modification

At the end of the PowerShell script (process-diskspd.ps1), in addition to the output to a tab-separated file, add the output of the variable $o:

foreach ($i in 0..$(([int]($g.name)) - 1)) {
    $($hsrt |% { $dh[$_][$i] }) -join $delim | out-file $outfile -Append
}

$o

Invocation

.\amd64fre\Diskspd.exe -b8K -d60 -h -L -o2 -t4 -r -w30 -c50M -Rxml c:\io.dat > out.xml
.\process-diskspd.ps1 -xmlresultpath C:\Diskspd-v2.0.17

Figure: Microsoft Diskspd Utility - enhanced IO result

Of course, the data can also simply be loaded into Excel.
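If Excel is not at hand, the tab-separated file can be read with a few lines of Python as well. This is only a sketch: the column names in the sample are made up for illustration, since the real columns depend on what process-diskspd.ps1 writes, and `load_tsv` is a hypothetical helper name.

```python
import csv
import io


def load_tsv(stream):
    """Read tab-separated text and return the rows as lists of strings."""
    return [row for row in csv.reader(stream, delimiter="\t") if row]


# Hypothetical sample data, NOT real DiskSpd output.
sample = "Column1\tColumn2\n8192\t1500.5\n"
rows = load_tsv(io.StringIO(sample))
print(rows)
```

For a real file, pass an open file object instead of the `io.StringIO` sample. Windows PowerShell's out-file writes UTF-16 by default, so opening the file with `encoding="utf-16"` and `newline=""` is likely needed there; the file name itself depends on how the script is called.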

Diskspd Utility – Basic Parameters

Parameter Description
-? Displays usage information for DiskSpd.
-ag Group affinity – affinitize threads in a round-robin manner across Processor Groups, starting at group 0.

This is default. Use -n to disable affinity.

-ag#,#[,#,…] Advanced CPU affinity – affinitize threads round-robin to the CPUs provided. The g# notation specifies Processor Groups for the following CPU core #s. Multiple Processor Groups may be specified, and groups/cores may be repeated. If no group is specified, 0 is assumed.

Additional groups/processors may be added, comma separated, or on separate parameters.

Examples:

-a0,1,2 and -ag0,0,1,2 are equivalent.

-ag0,0,1,2,g1,0,1,2 specifies the first three cores in groups 0 and 1. -ag0,0,1,2 -ag1,0,1,2 is an equivalent way of specifying the same pattern with two -ag# arguments.

-b<size>[K|M|G] Block size in bytes or KiB, MiB, or GiB (default = 64K)
-B<offset>[K|M|G|b] Base target offset in bytes or KiB, MiB, GiB, or blocks from the beginning of the target (default offset = zero)
-c<size>[K|M|G|b] Create files of the specified size. Size can be stated in bytes or KiBs, MiBs, GiBs, or blocks.
-C<seconds> Cool down time in seconds – continued duration of the test load after measurements are complete (default = zero seconds).
-D<milliseconds> Capture IOPs higher-order statistics in intervals of <milliseconds>. These are per-thread per-target: text output provides IOPs standard deviation, XML provides the full IOPs time series in addition (default = 1000ms or 1 second).
-d<seconds> Duration of measurement period in seconds, not including cool-down or warm-up time (default = 10 seconds).
-f<size>[K|M|G|b] Target size – use only the first <size> bytes or KiB, MiB, GiB or blocks of the specified targets, for example to test only the first sectors of a disk.
-f<rst> Open file with one or more additional access hints specified to the operating system:

r : the FILE_FLAG_RANDOM_ACCESS hint
s : the FILE_FLAG_SEQUENTIAL_SCAN hint
t : the FILE_ATTRIBUTE_TEMPORARY hint

Note that these hints are generally only applicable to cached IO.

-F<count> Total number of threads. Conflicts with -t, the option to set the number of threads per file.
-g<bytes per ms> Throughput per-thread per-target is throttled to the given number of bytes per millisecond. This option is incompatible with completion routines (-x).
-h Deprecated but still honored; see -Sh.
-i<count> Number of IOs (burst size) to issue before pausing. Must be specified in combination with -j.
-j<milliseconds> Pause in milliseconds before issuing a burst of IOs. Must be specified in combination with -i.
-I<priority> Set IO priority to <priority>. Available values are: 1-very low, 2-low, 3-normal (default).
-l Use large pages for IO buffers.
-L Measure latency statistics. Full per-thread per-target distributions are available with XML result output.
-n Disable default affinity (-a).
-o<count> Number of outstanding I/O requests per target per thread (1 = synchronous I/O, unless more than 1 thread is specified by using -F) (default = 2).
-p Start asynchronous (overlapped) I/O operations with the same offset. Only applicable with 2 or more outstanding I/O requests per thread (-o2 or greater)
-P<count> Enable printing a progress dot after each <count> completed I/O operations, counted separately by each thread (default = 65536).
-r<alignment>[K|M|G|b] Random I/O aligned to the specified number of <alignment> bytes or KiB, MiB, GiB, or blocks. Overrides -s.
-R[text|xml] Display test results in either text or XML format (default: text).
-s[i]<size>[K|M|G|b] Sequential stride size, offset between subsequent I/O operations in bytes or KiB, MiB, GiB, or blocks. Ignored if -r specified (default access = sequential, default stride = block size).

By default each thread tracks its own sequential offset. If the optional interlocked (i) qualifier is used, a single interlocked offset is shared between all threads operating on a given target so that the threads cooperatively issue a single sequential pattern of access to the target.

-S[bhruw] This flag modifies the caching and write-through modes for the test target. Any non-conflicting combination of modifiers can be specified (-Sbu conflicts, -Shw specifies w twice), order independent (-Suw and -Swu are equivalent).

By default, caching is on and write-through is not specified.

-S No modifying flags specified: disable software caching. Deprecated but still honored; see -Su.

This opens the target with the FILE_FLAG_NO_BUFFERING flag. This is included in -Sh.

-Sb Enable software cache (default, explicitly stated).

Can be combined with w.

-Sh Disable both software caching and hardware write caching.

This opens the target with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags, and is equivalent to -Suw.

-Sr Disable local caching for remote filesystems. This leaves the remote system’s cache enabled.

Can be combined with w.

-Su Disable software caching, for unbuffered IO.

This opens the target with the FILE_FLAG_NO_BUFFERING flag. This option is equivalent to -S with no modifiers. Can be combined with w.

-Sw Enable write-through IO.

This opens the target with the FILE_FLAG_WRITE_THROUGH flag. This can be combined with either buffered (-Sw or -Sbw) or unbuffered IO (-Suw). It is included in -Sh.

Note: SATA HDD will generally not honor write through intent on individual IOs. Devices with persistent write caches – certain enterprise flash drives, and most storage arrays – will complete write-through writes when the write is stable in cache. In both cases, -S / -Su and -Sh / -Suw will see equivalent behavior.

-t<count> Number of threads per target. Conflicts with -F, which specifies the total number of threads.
-T<offset>[K|M|G|b] Stride size between I/O operations performed on the same target by different threads in bytes or KiB, MiB, GiB, or blocks (default stride size = 0; starting offset = base file offset + (<thread number> * <offset>)). Makes sense only when number of threads per target > 1.
-v Verbose mode
-w<percentage> Percentage of write requests to issue (default = 0, 100% read). The following are equivalent and result in a 100% read-only workload: omitting -w, specifying -w with no percentage, and -w0.

IMPORTANT: a write test will destroy existing data without a warning.

-W<seconds> Warmup time – duration of the test before measurements start (default = 5 seconds).
-x Use I/O completion routines instead of I/O completion ports for cases specifying more than one IO per thread (see -o). Unless there is a specific reason to explore differences in the completion model, this should generally be left at default.
-X<filepath> Use an XML file for configuring the workload. Cannot be used with other parameters. XML output <Profile> block is a template. See the diskspd.xsd file for details.
-z[seed] Set random seed to specified integer value. With no -z, seed=0. With plain -z, seed is based on system run time.
-Z Zero the per-thread I/O buffers. Relevant for write tests. By default, the buffers are filled with a repeating pattern (0, 1, 2, …, 255, 0, 1, …)
-Z<size>[K|M|G|b] Separate read and write buffers, and initialize a per-target write source buffer sized to the specified number of bytes or KiB, MiB, GiB, or blocks. This write source buffer is initialized with random data, and per-IO write data is selected from it at 4-byte granularity.
-Z<size>[K|M|G|b],<file> Same, but using a file as the source of data to fill the write source buffers.

Source: DiskSpd_Documentation
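Two details from the parameter list are easier to grasp with numbers: the size suffixes (K, M, G are binary multiples, b counts blocks) and the per-thread starting offset given for -T. The following is a rough sketch under my own assumptions: `parse_size` and `thread_start_offsets` are hypothetical helpers, suffixes are treated as case-sensitive exactly as listed, and thread numbers are assumed to start at 0.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}  # KiB, MiB, GiB


def parse_size(text, block_size=None):
    """Convert a size such as '8K', '50M' or '100b' to a number of bytes."""
    suffix = text[-1]
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    if suffix == "b":  # a count of blocks, so the block size must be known
        if block_size is None:
            raise ValueError("the 'b' suffix needs a block size")
        return int(text[:-1]) * block_size
    return int(text)   # no suffix: plain bytes


def thread_start_offsets(threads, stride, base=0):
    """Starting offset per thread: base + thread number * stride (see -T).

    Assumption: thread numbers start at 0.
    """
    return [base + n * stride for n in range(threads)]


block = parse_size("4K")                                   # 4096
print(parse_size("50M"))                                   # 52428800
print(parse_size("100b", block))                           # 409600
print(thread_start_offsets(4, parse_size("1b", block)))    # [0, 4096, 8192, 12288]
```

With -b4K and -T1b, each of the four threads therefore starts one block after the previous one, under the assumptions stated above.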

Diskspd Utility – Examples

Test description, followed by the sample command:

Large area random concurrent reads of 4KB blocks:
diskspd -c2G -b4K -F8 -r -o32 -W60 -d60 -Sh testfile.dat

Large area random concurrent writes of 4KB blocks:
diskspd -c2G -w100 -b4K -F8 -r -o32 -W60 -d60 -Sh testfile.dat

Large area random concurrent reads of 64KB blocks:
diskspd -c2G -b64K -F8 -r -o32 -W60 -d60 -Sh testfile.dat

Large area random concurrent writes of 64KB blocks:
diskspd -c2G -w100 -b64K -F8 -r -o32 -W60 -d60 -Sh testfile.dat

Large area random serial reads of 4KB blocks:
diskspd -c2G -b4K -r -o1 -W60 -d60 -Sh testfile.dat

Large area random serial writes of 4KB blocks:
diskspd -c2G -w100 -b4K -r -o1 -W60 -d60 -Sh testfile.dat

Large area random serial reads of 64KB blocks:
diskspd -c2G -b64K -r -o1 -W60 -d60 -Sh testfile.dat

Large area random serial writes of 64KB blocks:
diskspd -c2G -w100 -b64K -r -o1 -W60 -d60 -Sh testfile.dat

Large area sequential concurrent reads of 4KB blocks:
diskspd -c2G -b4K -F8 -T1b -s8b -o32 -W60 -d60 -Sh testfile.dat

Large area sequential concurrent writes of 4KB blocks:
diskspd -c2G -w100 -b4K -F8 -T1b -s8b -o32 -W60 -d60 -Sh testfile.dat

Large area sequential concurrent reads of 64KB blocks:
diskspd -c2G -b64K -F8 -T1b -s8b -o32 -W60 -d60 -Sh testfile.dat

Large area sequential concurrent writes of 64KB blocks:
diskspd -c2G -w100 -b64K -F8 -T1b -s8b -o32 -W60 -d60 -Sh testfile.dat

Large area sequential serial reads of 4KB blocks:
diskspd -c2G -b4K -o1 -W60 -d60 -Sh testfile.dat

Large area sequential serial writes of 4KB blocks:
diskspd -c2G -w100 -b4K -o1 -W60 -d60 -Sh testfile.dat

Large area sequential serial reads of 64KB blocks:
diskspd -c2G -b64K -o1 -W60 -d60 -Sh testfile.dat

Large area sequential serial writes of 64KB blocks:
diskspd -c2G -w100 -b64K -o1 -W60 -d60 -Sh testfile.dat

Small area concurrent reads of 4KB blocks:
diskspd -c100b -b4K -o32 -F8 -T1b -s8b -W60 -d60 -Sh testfile.dat

Small area concurrent writes of 4KB blocks:
diskspd -c100b -w100 -b4K -o32 -F8 -T1b -s8b -W60 -d60 -Sh testfile.dat

Small area concurrent reads of 64KB blocks:
diskspd -c100b -b64K -o32 -F8 -T1b -s8b -W60 -d60 -Sh testfile.dat

Small area concurrent writes of 64KB blocks:
diskspd -c100b -w100 -b64K -o32 -F8 -T1b -s8b -W60 -d60 -Sh testfile.dat

Gather data about physical disk I/O events and memory events from NT Kernel Logger:
diskspd -eDISK_IO -eMEMORY_PAGE_FAULTS testfile.dat

Instruct NT Kernel Logger to use paged memory instead of non-paged memory and gather data concerning physical disk I/O events:
diskspd -eDISK_IO -ep testfile.dat

Run DiskSpd and signal events when the actual test starts and finishes:
diskspd -ysMyTestStartedEvent -yfMyTestFinishedEvent testfile1.dat

Run a few separate instances of DiskSpd, but synchronize their start and stop times:
diskspd -o1 -t2 -a0,1 -yrMyStartEvent -ypMyStopEvent testfile1.dat
diskspd -r -t2 -a2,3 -yrMyStartEvent -ypMyStopEvent testfile2.dat
diskspd -Sh -t4 -a4,5,6,7 -yrMyStartEvent -ypMyStopEvent testfile3.dat
diskspd -yeMyStartEvent
rem After a few seconds
diskspd -yeMyStopEvent

Source: DiskSpd_Documentation
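The "large area random" samples follow a regular pattern (concurrent or serial, 4K or 64K blocks, read or write), so they can be generated instead of typed. Below is a small sketch of that idea; `random_io_commands` is a hypothetical helper of mine. Writes are requested with an explicit -w100, because according to the parameter list a bare -w without a percentage results in a read-only workload.

```python
from itertools import product


def random_io_commands(target="testfile.dat"):
    """Rebuild the eight 'large area random' sample commands as strings."""
    commands = []
    for mode, block, op in product(("concurrent", "serial"),
                                   ("4K", "64K"),
                                   ("read", "write")):
        args = ["diskspd", "-c2G"]
        if op == "write":
            args.append("-w100")            # explicit 100% writes
        args.append(f"-b{block}")
        if mode == "concurrent":
            args += ["-F8", "-r", "-o32"]   # 8 threads in total, 32 outstanding IOs
        else:
            args += ["-r", "-o1"]           # no -F given, one outstanding IO
        args += ["-W60", "-d60", "-Sh", target]
        commands.append(" ".join(args))
    return commands


for line in random_io_commands():
    print(line)
```

The sketch only prints the command lines. Keep in mind that a write test destroys existing data in the target without a warning, so the target should always be a dedicated test file.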
