The Windows operating system provides several technologies for developing distributed, scalable network-based applications. Some of the less common ones, and how they are used, are discussed below.
Overlapped IO
- Provides the asynchronous IO capability built into the OS, enabling concurrent reads/writes by multiple threads. In .NET this comes for free with the managed thread pool.
- Overlapped IO is useful when IOs are executed in parallel. For example, when a data file is assembled from strips received in UDP packets.
- Operations such as ReadFile / WriteFile take a pointer to an OVERLAPPED structure as a parameter.
- To perform asynchronous overlapped IO, the file must be opened through the CreateFile API with the FILE_FLAG_OVERLAPPED flag.
- In the asynchronous case, ReadFile / WriteFile may return immediately, before the operation has finished. They then return FALSE and GetLastError reports ERROR_IO_PENDING.
- The GetOverlappedResult API can be used to check the status of the IO operation, or to wait for it to complete.
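The calls listed above can be sketched in C++ roughly as follows. This is a minimal sketch rather than a definitive implementation: it assumes fixed-size strips, and `strip_offset`, `offset_low`, `offset_high`, `open_for_overlapped_write` and `write_at` are hypothetical helper names, not Windows APIs. The Win32 part is guarded by `_WIN32` because it only builds on Windows.

```cpp
#include <cstdint>

// Hypothetical helper: byte offset of strip number `index` in the output
// file, assuming every strip (except possibly the last) has `strip_size` bytes.
constexpr std::uint64_t strip_offset(std::uint64_t index, std::uint32_t strip_size) {
    return index * strip_size;
}

// OVERLAPPED carries the 64-bit file offset as two 32-bit halves.
constexpr std::uint32_t offset_low(std::uint64_t offset) {
    return static_cast<std::uint32_t>(offset & 0xFFFFFFFFu);
}
constexpr std::uint32_t offset_high(std::uint64_t offset) {
    return static_cast<std::uint32_t>(offset >> 32);
}

#ifdef _WIN32
#include <windows.h>

// Opens (or creates) the output file for asynchronous access.
HANDLE open_for_overlapped_write(const wchar_t* path) {
    return CreateFileW(path, GENERIC_WRITE, 0, nullptr, OPEN_ALWAYS,
                       FILE_FLAG_OVERLAPPED, nullptr);
}

// Starts one write at `offset` and then waits for it to finish.
// A real receiver would keep several writes in flight before waiting.
bool write_at(HANDLE file, std::uint64_t offset, const void* data, DWORD size) {
    OVERLAPPED ov = {};
    ov.Offset = offset_low(offset);
    ov.OffsetHigh = offset_high(offset);
    ov.hEvent = CreateEventW(nullptr, TRUE, FALSE, nullptr);  // manual-reset event
    if (ov.hEvent == nullptr) return false;

    BOOL ok = WriteFile(file, data, size, nullptr, &ov);
    if (!ok && GetLastError() != ERROR_IO_PENDING) {
        CloseHandle(ov.hEvent);  // the write could not even be started
        return false;
    }

    DWORD written = 0;
    ok = GetOverlappedResult(file, &ov, &written, TRUE);  // TRUE = block until done
    CloseHandle(ov.hEvent);
    return ok && written == size;
}
#endif
```

A caller would typically compute `strip_offset(n, stripSize)` from the strip number carried in each packet and pass it to `write_at`, so strips can be written in whatever order they arrive.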
IO Completion Port
- IO completion ports (IOCP) provide an efficient, scalable way to consume asynchronous IO results, enabling concurrent reads/writes to be serviced by multiple threads.
- IOCPs are useful for processing, in one centralized place, the results of multiple async IOs executed in parallel on multiple handles (files, sockets, pipes, etc.). For example, a web server such as IIS.
- To use IOCP functionality, an IOCP must first be created by calling the CreateIoCompletionPort API with INVALID_HANDLE_VALUE as the file handle.
- A thread pool needs to be created in which multiple threads wait on the IOCP in the GetQueuedCompletionStatus API to process completed IOs.
- The handles on which async IOs are performed should be associated with the above IOCP, again through CreateIoCompletionPort. A unique completion key can also be supplied.
- Upon completion of asynchronous overlapped IOs on the above handles, one of the threads from the thread pool is released to process the results.
- The .NET library provides a separate IOCP thread pool. Classes such as FileStream use it internally to support async operations.
- A good Winsock based example in C++ can be found here
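The sequence described above (create the port, start the waiting threads, associate handles, process completions) might look like the following sketch. To stay self-contained it simulates completions with PostQueuedCompletionStatus instead of performing real file or socket IO. `iocp_worker`, `run_iocp_demo` and `kShutdownKey` are hypothetical names chosen for illustration, and the code is guarded by `_WIN32` because it only builds on Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical sentinel key telling a worker to exit; handles associated
// with the port must therefore use non-zero completion keys.
const ULONG_PTR kShutdownKey = 0;

// Worker loop: block on the port and handle one completion at a time.
void iocp_worker(HANDLE port, std::atomic<int>& handled) {
    for (;;) {
        DWORD bytes = 0;
        ULONG_PTR key = 0;
        OVERLAPPED* ov = nullptr;
        BOOL ok = GetQueuedCompletionStatus(port, &bytes, &key, &ov, INFINITE);
        if (!ok && ov == nullptr) break;  // the port itself failed or was closed
        if (key == kShutdownKey) break;   // sentinel posted by the owner
        // For real IO, `key` identifies the handle and `ov` the operation;
        // if `ok` is FALSE here, that single operation failed.
        ++handled;
    }
}

// Returns the number of completion packets processed, or -1 on failure.
int run_iocp_demo(int worker_count, int packet_count) {
    assert(worker_count > 0);

    // Create a port that is not yet associated with any handle.
    HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, nullptr, 0, 0);
    if (port == nullptr) return -1;

    // Thread pool waiting on the port.
    std::atomic<int> handled{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < worker_count; ++i)
        workers.emplace_back(iocp_worker, port, std::ref(handled));

    // Real code would associate each overlapped handle with the port here:
    //   CreateIoCompletionPort(fileOrSocket, port, completionKey, 0);
    // This sketch posts packets by hand instead of issuing real IO.
    for (int i = 0; i < packet_count; ++i)
        PostQueuedCompletionStatus(port, 0, 1 /* non-shutdown key */, nullptr);

    // One sentinel per worker, queued behind the work packets.
    for (int i = 0; i < worker_count; ++i)
        PostQueuedCompletionStatus(port, 0, kShutdownKey, nullptr);
    for (auto& t : workers) t.join();

    CloseHandle(port);
    return handled.load();
}
#endif
```

Because the sentinels are queued after the work packets, each worker keeps draining completions until it dequeues its own shutdown packet, so `run_iocp_demo(4, 100)` is expected to report all 100 packets.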
File Mapping backed by NTFS sparse file
- Sparse data consists of large chunks of data containing only 0s.
- In a file system these occupy valuable disk space without adding much value.
- NTFS addresses this with a file type called a sparse file.
- The big advantage of a sparse file is that the sparse data it contains is not committed to disk, as seen in the screenshot below.
- The OS keeps track of these sparse data segments in its internal data structures. When sparse data is read, the OS returns an array of 0s instead of physically reading the file.
- When real data is written to a sparse file, the OS updates its internal data structures to reflect the changes.
- One of the disadvantages of shared memory is that either huge blocks of memory must be committed upfront, or complicated programming is required to commit memory on the fly within previously reserved address space.
- An NTFS sparse-file-backed memory mapped file enables access to a large address space without committing disk space.
- When an application writes into the address space, the OS transparently commits only as much storage as is written.
- When an unwritten part of the address space is read, no fault is raised; instead a block of 0s is returned.
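A sparse-file-backed mapping of the kind described above could be set up roughly as follows. This is a sketch under stated assumptions: the `SparseMapping` struct and the `open_sparse_mapping` / `close_sparse_mapping` helpers are hypothetical names, the target volume is assumed to be NTFS, and the whole file is mapped as one view (which for sizes above about 1 GB realistically requires a 64-bit process). The code is guarded by `_WIN32` because it only builds on Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <winioctl.h>  // FSCTL_SET_SPARSE
#include <cassert>
#include <cstdint>

// Hypothetical bundle of the handles that make up one mapping.
struct SparseMapping {
    HANDLE file = INVALID_HANDLE_VALUE;
    HANDLE mapping = nullptr;
    unsigned char* view = nullptr;
};

void close_sparse_mapping(SparseMapping& m) {
    if (m.view) UnmapViewOfFile(m.view);
    if (m.mapping) CloseHandle(m.mapping);
    if (m.file != INVALID_HANDLE_VALUE) CloseHandle(m.file);
    m = SparseMapping{};
}

// Creates `path` as a sparse file of `size` bytes and maps all of it.
bool open_sparse_mapping(const wchar_t* path, std::uint64_t size, SparseMapping& out) {
    assert(size > 0);  // a zero-length file cannot be mapped
    SparseMapping m;
    m.file = CreateFileW(path, GENERIC_READ | GENERIC_WRITE, 0, nullptr,
                         CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (m.file == INVALID_HANDLE_VALUE) return false;

    // Mark the file sparse before it grows; this fails on volumes
    // without sparse file support.
    DWORD returned = 0;
    if (!DeviceIoControl(m.file, FSCTL_SET_SPARSE, nullptr, 0,
                         nullptr, 0, &returned, nullptr)) {
        close_sparse_mapping(m);
        return false;
    }

    // A writable mapping larger than the file extends the file to `size`.
    m.mapping = CreateFileMappingW(m.file, nullptr, PAGE_READWRITE,
                                   static_cast<DWORD>(size >> 32),
                                   static_cast<DWORD>(size & 0xFFFFFFFFu), nullptr);
    if (m.mapping == nullptr) {
        close_sparse_mapping(m);
        return false;
    }

    // 0 bytes-to-map means "from the offset to the end of the mapping".
    m.view = static_cast<unsigned char*>(
        MapViewOfFile(m.mapping, FILE_MAP_ALL_ACCESS, 0, 0, 0));
    if (m.view == nullptr) {
        close_sparse_mapping(m);
        return false;
    }
    out = m;
    return true;
}
#endif
```

After a successful call, reading `view[i]` for any untouched offset yields 0, and storing to `view[i]` dirties only the page containing that offset; only those written regions should end up allocated on disk.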
Usage Example
- During acquisition, the data files generated vary in number and size.
- In other words, the total size of the data files generated can vary from 100 MB to 1.5 GB.
- If the storage is allocated in shared memory, the result is underutilization, since most of the committed pages contain sparse data; this also affects the overall performance of the system.
- As an alternative, if the storage is allocated in a memory mapped file, the disk space is committed upfront. Again, most pages will contain sparse data.
- An NTFS sparse-file-backed MMF eliminates both of these shortcomings, since sparse data is never committed to disk.
- This difference can be seen in the screenshot above.
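The same difference can also be checked programmatically by comparing a file's logical size with the space actually allocated for it. The sketch below is one way to do that, not the method used to produce the screenshot. `make_u64` and `query_sizes` are hypothetical helper names; GetFileAttributesExW and GetCompressedFileSizeW are the Win32 calls assumed to supply the two figures. The Win32 part is guarded by `_WIN32` because it only builds on Windows.

```cpp
#include <cstdint>

// Win32 reports 64-bit sizes as two 32-bit halves; join them.
constexpr std::uint64_t make_u64(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

#ifdef _WIN32
#include <windows.h>

// Fills `logical` with the file size and `on_disk` with the bytes actually
// allocated; for a sparse file the second value can be much smaller.
bool query_sizes(const wchar_t* path, std::uint64_t& logical, std::uint64_t& on_disk) {
    WIN32_FILE_ATTRIBUTE_DATA info = {};
    if (!GetFileAttributesExW(path, GetFileExInfoStandard, &info)) return false;
    logical = make_u64(info.nFileSizeHigh, info.nFileSizeLow);

    DWORD high = 0;
    SetLastError(NO_ERROR);
    DWORD low = GetCompressedFileSizeW(path, &high);
    // INVALID_FILE_SIZE is also a legal low half, so check the error code too.
    if (low == INVALID_FILE_SIZE && GetLastError() != NO_ERROR) return false;
    on_disk = make_u64(high, low);
    return true;
}
#endif
```

For a sparse data file that has only been partially written, `on_disk` would be expected to stay well below `logical`, whereas for a regular file the two values are the same or differ only by allocation rounding.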