Tuesday, May 24, 2022

Learning some useful Windows OS Concepts

The Windows operating system provides multiple technologies for developing distributed, scalable, network-based applications. Some less common technologies and their usage are discussed below.

Overlapped IO 

• Overlapped I/O is the operating system's native mechanism for asynchronous I/O, enabling concurrent reads/writes by multiple threads. The .NET managed thread pool provides this capability for free.
• Overlapped I/O is useful when I/Os are executed in parallel, for example when a data file is assembled from strips received in UDP packets.
• Operations such as ReadFile / WriteFile take an OVERLAPPED structure parameter.
• To perform asynchronous overlapped I/O, files are opened via the CreateFile API with the FILE_FLAG_OVERLAPPED flag.
• In the asynchronous case, ReadFile / WriteFile may return immediately, with GetLastError() reporting ERROR_IO_PENDING.
• The GetOverlappedResult API can be used to check the status of the I/O operation.

IO Completion Port

• I/O completion ports provide the fastest asynchronous I/O capability the OS offers, enabling concurrent reads/writes by multiple threads.
• IOCPs are useful for processing the results of multiple async I/Os executed in parallel on multiple handles (files, sockets, pipes, etc.) in a centralized place — for example, a web server such as IIS.
• To use IOCP functionality, first an IOCP needs to be created using the CreateIoCompletionPort API with INVALID_HANDLE_VALUE.
• A thread pool needs to be created in which multiple threads wait on the IOCP in the GetQueuedCompletionStatus API to process completed I/Os.
• The handles on which async I/Os are performed should be associated with the above IOCP. A unique completion key can also be supplied.
• Upon completion of asynchronous overlapped I/Os on the associated handles, one of the threads from the thread pool is released to process the results.
• The .NET library provides a separate IOCP thread pool. Classes such as FileStream internally use it to support async operations.
• A good Winsock-based example in C++ can be found here

File Mapping backed by NTFS sparse file

• Sparse data consists of data chunks containing only zeros.
• In a file system, these chunks occupy valuable disk space without adding much value.
• NTFS introduces a file type called the sparse file.
• The big advantage of a sparse file is that its sparse data is not committed to disk, as seen in the screenshot below.

• The OS keeps track of these sparse data segments in its internal data structures. When sparse data is read, the OS returns an array of zeros instead of physically reading the file.
• When real data is written to a sparse file, the OS updates its internal data structures to reflect the change.
• One disadvantage of shared memory is that either huge blocks of memory must be committed up front, or complicated programming is required to commit memory on the fly for previously reserved address space.
• A memory-mapped file backed by an NTFS sparse file enables access to a large address space without committing disk space.
• The OS transparently commits only as much storage as the application actually writes into the address space.
• When an unwritten part of the address space is read, no fault is raised; a block of zeros is returned instead.
Usage Example
• During acquisition, the data files generated vary in number and size.
• In other words, the total size of the data files generated can vary from 100 MB to 1.5 GB.
• Allocating this storage in shared memory results in underutilization, since most of the committed pages contain sparse data, which also affects overall performance of the system.
• If, as an alternative, the storage is allocated in a regular memory-mapped file, the disk space is committed up front; again, most pages will contain sparse data.
• A memory-mapped file backed by an NTFS sparse file eliminates both of these shortcomings, since sparse data is never committed to disk.
• This difference can be seen in the screenshot above.

Source and Binaries can be found here.

Wednesday, May 4, 2022

Ultra-fast compression and decompression Win32 APIs for real-time applications



The NTFS file system internally uses ultra-fast, real-time data compression and decompression.
Starting with Windows 10, it is used internally by the OS as shown below.

The same APIs, RtlCompressBuffer and RtlDecompressBuffer, can be used by user applications as well.
More information is available on the MSDN page.

The tables below show data collected for the first 10 data files during an acquisition run.
Savings of up to 38% can be achieved on a standard 1.8 MB data file. Compression timings as low as 16 milliseconds and decompression timings as low as 15 milliseconds are clocked per data file.


In another example below, a 5 MB DICOM image was compressed by almost 40% in 38 ms and decompressed in 8 ms.

TestApp IM_0047

Compression [fast]
uncompressed size: 5509714    compressed size: 3378829    time in ms: 38

Decompression [fast]
compressed size: 3378829    uncompressed size: 5509714    time in ms: 8

Compression [slow]
uncompressed size: 5509714    compressed size: 3203440    time in ms: 6235

Decompression [slow]
compressed size: 3203440    uncompressed size: 5509714    time in ms: 8




Source and Binaries can be found here.