Next: The solution Up: BATCHING: A Design Pattern Previous: Introduction

The problem

Both cross-domain data traffic and cross-domain call latency have a significant impact on the efficiency of multi-domain applications. Cross-domain calls and cross-domain data transfers also happen in centralized environments. For instance, almost every operating system has a domain boundary between user space and kernel space (both entering and leaving the kernel require a domain crossing). An application using multiple processes has a domain boundary between each pair of its processes. In a distributed system, moreover, the network behaves as a domain boundary.

The line separating two different domains has to be considered while designing the application. Two main issues cause problems for any application that crosses that line: data movement and call latency.

Within a protection domain (e.g., a UNIX process), an object can pass data efficiently to any other object. To pass a large amount of data, a reference can be used. However, whenever an object has to pass data to an object in a different domain, the data has to be copied. (Although some zero-copy networking frameworks avoid data copying within a single node of a network, data still has to be "copied" through the network in distributed applications.)

In many application domains, such as file systems and databases, data movement can be the actual performance bottleneck of the application. Therefore, avoiding unnecessary data transfer operations can be crucial.

Under many circumstances, unnecessary data transfers occur just because the object controlling the operation resides far from the data source and/or the data sink. That is precisely what happens in the file copy example in the previous section: the client object performing the copy and the file server objects were placed in different domains. Thus, data came to the client just to go back to the server.
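The file copy situation can be made concrete with a minimal sketch. It is not the code from the previous section: the FileServer class, its read and write methods, and the two counters are hypothetical names introduced here only to count what crosses the boundary when a client in another domain drives the copy.

```python
class FileServer:
    """Hypothetical in-memory file server living in its own domain."""

    def __init__(self, files):
        self.files = {name: bytearray(data) for name, data in files.items()}
        self.bytes_crossed = 0  # payload bytes moved across the domain boundary
        self.calls = 0          # number of cross-domain calls received

    def read(self, name, offset, size):
        self.calls += 1
        chunk = bytes(self.files[name][offset:offset + size])
        self.bytes_crossed += len(chunk)  # data travels server -> client
        return chunk

    def write(self, name, data):
        self.calls += 1
        self.bytes_crossed += len(data)   # data travels client -> server
        self.files.setdefault(name, bytearray()).extend(data)


def client_copy(server, src, dst, block_size=4096):
    """Copy src to dst from the client side, one block at a time."""
    offset = 0
    while True:
        chunk = server.read(src, offset, block_size)
        if not chunk:          # an empty read signals end of file
            break
        server.write(dst, chunk)
        offset += len(chunk)


server = FileServer({"src": b"x" * 10000})
client_copy(server, "src", "dst")
print(server.bytes_crossed, server.calls)  # prints "20000 7"
```

Every byte of the 10000-byte file crosses the boundary twice, once in each direction, and the copy takes four reads (the last one returns no data) plus three writes, although both the source and the destination live in the server's domain.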

Another issue is call latency. A call between two objects residing in different domains is much more expensive than a typical method call within a single domain. The reason is simply that a domain boundary has to be crossed, which usually involves the operating system kernel (in a single node), network messaging (in a distributed environment), or both.
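A rough way to observe this difference on a single node is to time a plain function call against a round trip through a pipe, where each os.write and os.read enters and leaves the kernel. The sketch below is only illustrative: the helper names are hypothetical, the absolute figures depend on the machine and operating system, and interpreter overhead inflates both measurements.

```python
import os
import time


def in_domain_call(data):
    """An ordinary call that never leaves the process."""
    return data


def cross_domain_call(read_fd, write_fd, data):
    """Push data through a pipe: two system calls, each crossing
    the user/kernel boundary on the way in and on the way out."""
    os.write(write_fd, data)
    return os.read(read_fd, len(data))


def time_per_call(fn, repeats=10000):
    """Average wall-clock seconds per invocation of fn."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


read_fd, write_fd = os.pipe()
payload = b"ping"

local = time_per_call(lambda: in_domain_call(payload))
crossing = time_per_call(lambda: cross_domain_call(read_fd, write_fd, payload))
reply = cross_domain_call(read_fd, write_fd, payload)

os.close(read_fd)
os.close(write_fd)

print(f"in-domain call:    {local * 1e9:.0f} ns")
print(f"cross-domain call: {crossing * 1e9:.0f} ns")
```

A call that crosses a network would add message transmission time on top of the kernel crossings measured here.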

Therefore, avoiding domain crossing when performing calls is crucial for performance. Any solution reducing the number of domain crossings can make the application run faster.

When designing a solution, keep in mind that under certain circumstances (e.g., when domain crossing is cheap and efficiency is the primary objective), the overhead introduced by the solution itself might actually degrade performance. However, even when domain crossing is cheap, the cost of cross-domain data transfers (e.g., copying data or sending messages over a network) might still be a performance problem.

Any solution must carefully weigh the real penalty caused by data copying and call latency, and it should be employed only when the overhead it introduces is small compared to the penalties it avoids.
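This trade-off can be written down as a simple break-even check. The sketch below is an assumed first-order cost model, not a formula taken from the pattern: the function name, its parameters, and all the figures in the two scenarios are hypothetical and chosen only for illustration.

```python
def net_gain_ns(crossings_avoided, crossing_cost_ns,
                bytes_avoided, copy_cost_ns_per_byte,
                overhead_ns):
    """Nanoseconds saved by a solution, under a linear cost model.

    A positive result means the avoided penalties outweigh the
    overhead the solution introduces; a negative result means the
    solution makes things slower.
    """
    saved = (crossings_avoided * crossing_cost_ns
             + bytes_avoided * copy_cost_ns_per_byte)
    return saved - overhead_ns


# Scenario 1 (assumed figures): cheap crossings, no data movement avoided.
cheap = net_gain_ns(crossings_avoided=6, crossing_cost_ns=100,
                    bytes_avoided=0, copy_cost_ns_per_byte=0,
                    overhead_ns=5000)

# Scenario 2 (assumed figures): crossings go over a network and
# 20000 bytes no longer have to travel to the client and back.
remote = net_gain_ns(crossings_avoided=6, crossing_cost_ns=500_000,
                     bytes_avoided=20_000, copy_cost_ns_per_byte=10,
                     overhead_ns=5000)

print(cheap)   # prints -4400: the overhead exceeds the savings
print(remote)  # prints 3195000: the savings dominate
```

In practice the costs would have to be measured on the target system, since both crossing cost and per-byte transfer cost vary widely between a user/kernel boundary and a network.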

Francisco J Ballesteros