Autonomic Throughput Optimisation

Ian Tree  28 August 2013 12:33:35

The Domino eXplorer (DX) was developed to facilitate the rapid development of C++ tools for projects that involve high volumes of data transformation. DX has been, and continues to be, developed for use across a wide range of Domino versions and platforms. The tool set is also appropriate for Business Intelligence applications that have to process "Big Data" in Notes Databases. DX is also used as a research tool to investigate various aspects of Autonomic Systems, in particular Autonomic Throughput Optimisation.

Using Domino eXplorer (DX) applications in typical "Heavy Lift Computing" projects can involve a lot of performance tuning to get applications to meet their throughput targets. This can be a technically challenging and time-consuming task.

The goal of the research is to find a model and algorithms that would allow the DX kernel to maximise the throughput of an application autonomically, independent of the current workload profile and the state of the execution environment. The kernel uses a very simple component model to explore different aspects of throughput optimisation. In the model the application is viewed as an agent that generates a stream of requests, each of which represents a unit of workload. The application passes these requests to the kernel, the kernel executes them on behalf of the application, and their execution consumes resources from the execution environment. The following properties influence application throughput in the model.

1. Application design.
2. The profile of the units of workload presented in each request.
3. Constraints, contention and limits on resource consumption from the execution environment.
4. The multiprogramming level in the kernel.

Application Design

The kernel influences the application design through the API that it exposes to the application. The application design cannot be varied dynamically during execution, so this property is not considered for autonomic optimisation.

Unit of Workload Profile

The application and kernel have little influence over the profile of a primitive unit of workload in terms of wait states, I/O demand, memory demand and CPU demand. However, the DX3 kernel already recognises that each request submitted for execution should multiplex a number of primitive units of workload. Some applications that use the DX3 kernel determine an appropriate multiplex level at execution time according to different factors. The multiplex level is determined in the application layer and is not communicated to the kernel.

The current research version of the kernel, DX3R, takes a slightly different approach that requires only minimal code change in the application layer. Every request submitted to the kernel contains a single primitive unit of workload, and the kernel builds these unit requests into "trains". This mechanism replaces the multiplexing currently performed in the application layer. The application must indicate to the kernel when the request being submitted can be multiplexed with the previous request. The execution code iterates over the train of requests assembled by the kernel rather than over a collection of primitive units of workload contained in a single request.
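The train mechanism can be illustrated with a minimal sketch. This is not the DX3R implementation: the names UnitRequest, TrainBuilder and executeTrain, the chainable flag and the maximum train length are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical request type holding one primitive unit of workload.
struct UnitRequest {
    int  id;         // illustrative payload
    bool chainable;  // set by the application: may join the previous request's train
};

using Train = std::vector<UnitRequest>;

// Hypothetical kernel-side builder that assembles unit requests into trains.
class TrainBuilder {
public:
    explicit TrainBuilder(std::size_t maxLen) : maxLen_(maxLen) { assert(maxLen > 0); }

    // Accept one unit request; any train completed by it is appended to `ready`.
    void submit(const UnitRequest& req, std::vector<Train>& ready) {
        // A request that is not chainable must start a new train.
        if (!current_.empty() && !req.chainable) flush(ready);
        current_.push_back(req);
        if (current_.size() >= maxLen_) flush(ready);
    }

    // Release whatever has been collected so far, for example at end of input.
    void flush(std::vector<Train>& ready) {
        if (current_.empty()) return;
        ready.push_back(std::move(current_));
        current_.clear();  // return the moved-from vector to a known empty state
    }

    // The multiplex level can be changed between submissions.
    void setMaxLen(std::size_t n) { assert(n > 0); maxLen_ = n; }

private:
    std::size_t maxLen_;
    Train current_;
};

// Execution code iterates over the train rather than over units inside one request.
int executeTrain(const Train& train) {
    int done = 0;
    for (const UnitRequest& r : train) {
        (void)r;  // real work on the unit would happen here
        ++done;
    }
    return done;
}
```

Because the train length is held by the builder rather than by the application, a kernel built along these lines could change the multiplex level at any time without the application being aware of it.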

Execution Environment

Operating systems already have enough problems trying to understand and manage the constraints, contention and limits of resource consumption in the execution environment. The DX kernel makes no attempt to measure or manage any aspect of the behaviour of the execution environment. However, the kernel is aware that changes within the execution environment will directly affect optimal throughput levels.

Multiprogramming Level

The current DX3 production kernel does not dynamically vary the number of worker threads in the thread pool that services the requests generated by the application. The number is determined by the application (usually through a command line parameter). The research DX3R kernel does, however, have the capability to actively manage the number of available threads on demand.
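One way to vary the multiprogramming level at run time is sketched below. It is an assumption about how such a pool might look, not the DX3R design: the class AdjustablePool and its methods are hypothetical. A fixed set of threads is created up front and only those whose index is below the current level are allowed to take work.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool with an adjustable number of active workers.
class AdjustablePool {
public:
    AdjustablePool(std::size_t maxThreads, std::size_t initialLevel)
        : level_(initialLevel) {
        assert(initialLevel <= maxThreads);
        for (std::size_t i = 0; i < maxThreads; ++i)
            workers_.emplace_back([this, i] { run(i); });
    }

    // Active workers drain the queue, then all threads are joined.
    // If the level is 0 at this point, queued work is discarded.
    ~AdjustablePool() {
        { std::lock_guard<std::mutex> lk(m_); stopping_ = true; }
        cv_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); }
        cv_.notify_all();
    }

    // Change the multiprogramming level; takes effect as workers finish their current job.
    void setLevel(std::size_t level) {
        { std::lock_guard<std::mutex> lk(m_); level_ = level; }
        cv_.notify_all();
    }

    std::size_t level() const { std::lock_guard<std::mutex> lk(m_); return level_; }
    std::size_t completed() const { return completed_.load(); }

private:
    void run(std::size_t index) {
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            // Workers at or above the current level stay parked here.
            cv_.wait(lk, [this, index] {
                return stopping_ || (index < level_ && !jobs_.empty());
            });
            // Waking with nothing to do is only possible when the pool is stopping.
            if (index >= level_ || jobs_.empty()) return;
            std::function<void()> job = std::move(jobs_.front());
            jobs_.pop();
            lk.unlock();
            job();
            ++completed_;
            lk.lock();
        }
    }

    mutable std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::size_t level_;
    bool stopping_ = false;
    std::atomic<std::size_t> completed_{0};
    std::vector<std::thread> workers_;
};
```

Parking surplus threads instead of destroying them keeps a level change cheap, at the cost of holding idle threads. A controller could sample completed() at intervals to obtain the throughput figures it needs.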

Current Research Model

The current research model can manipulate both the multiprogramming level and the size of request "trains" (the unit of workload multiplex level). The kernel looks at the recent history of throughput, measured as the number of primitive units of workload completed in a unit of time. It adjusts the workload multiplex level and the multiprogramming level to find the optimal throughput level, and to maintain it as the workload characteristics and execution environment change. The kernel expects that for any application, at a particular point in time and for a limited time interval, there will exist a "sweet spot" at which the settings maximise throughput, as illustrated below.

[Figure: Sweet Spot]
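The "recent history of throughput" can be pictured as a sliding window of measurement intervals. The sketch below is illustrative only: the class ThroughputHistory, its window size and the use of seconds as the unit of time are assumptions, not details taken from the DX kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical sliding window over the most recent measurement intervals.
class ThroughputHistory {
public:
    explicit ThroughputHistory(std::size_t window) : window_(window) { assert(window > 0); }

    // Record the primitive units of workload completed during one interval.
    void record(std::size_t unitsCompleted, double intervalSeconds) {
        samples_.push_back(Sample{intervalSeconds, unitsCompleted});
        if (samples_.size() > window_) samples_.pop_front();  // forget the oldest interval
    }

    // Throughput across the window, in primitive units of workload per second.
    double unitsPerSecond() const {
        double seconds = 0.0;
        std::size_t units = 0;
        for (const Sample& s : samples_) {
            seconds += s.seconds;
            units += s.units;
        }
        return seconds > 0.0 ? static_cast<double>(units) / seconds : 0.0;
    }

private:
    struct Sample {
        double seconds;
        std::size_t units;
    };
    std::size_t window_;
    std::deque<Sample> samples_;
};
```

A short window reacts quickly to changes in the workload or the execution environment but gives a noisier figure, while a long window is smoother but slower to notice that the sweet spot has moved.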

Algorithms are currently being investigated that continuously search for the sweet spot, based on the recent history of the workload multiplex and multiprogramming level settings in combination with the recent history of throughput achieved.
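The article does not say which algorithms are under investigation. As one simple possibility, the sketch below uses a coordinate-wise hill climb: it changes one setting by one step, keeps the change if the measured throughput improves, and otherwise undoes it and tries the other setting. The names Settings and SweetSpotSearch, the step size of one and the bounds are all assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical pair of settings under the kernel's control.
struct Settings {
    int threads;      // multiprogramming level
    int trainLength;  // workload multiplex level
};

// Hypothetical hill-climbing search for the sweet spot.
class SweetSpotSearch {
public:
    SweetSpotSearch(Settings start, Settings lo, Settings hi)
        : current_(start), accepted_(start), lo_(lo), hi_(hi) {
        assert(lo.threads <= hi.threads && lo.trainLength <= hi.trainLength);
    }

    // Report the throughput measured under the settings last returned
    // (or under `start` on the first call); returns the next settings to try.
    Settings next(double measured) {
        if (measured > baseline_) {
            // The probe improved throughput: keep it and continue in the same direction.
            baseline_ = measured;
            accepted_ = current_;
        } else {
            // No improvement: undo the probe, reverse this axis, switch to the other axis.
            current_ = accepted_;
            dir_[axis_] = -dir_[axis_];
            axis_ = 1 - axis_;
        }
        if (axis_ == 0)
            current_.threads = clampTo(current_.threads + dir_[0], lo_.threads, hi_.threads);
        else
            current_.trainLength =
                clampTo(current_.trainLength + dir_[1], lo_.trainLength, hi_.trainLength);
        return current_;
    }

    // Best settings accepted so far.
    Settings best() const { return accepted_; }

private:
    static int clampTo(int v, int lo, int hi) { return std::max(lo, std::min(v, hi)); }

    Settings current_, accepted_, lo_, hi_;
    double baseline_ = -1.0;  // below any real throughput, so the first sample is accepted
    int axis_ = 0;            // 0 = threads, 1 = train length
    int dir_[2] = {1, 1};     // current step direction on each axis
};
```

This sketch never lowers its baseline, so it would hold on to a sweet spot that no longer exists. A search that has to track changes in workload and environment would need to re-measure or age the baseline, for example by using a windowed throughput history.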