Methods for Estimating
We want to recommend not to purchase Production hardware before Production user (software and business) requirement details are known. We do not know of any way to estimate hardware specifications for requirements that are not quantified in some tangible way. Every infrastructure estimation is based upon an explicit or implicit model of what the use cases will actually be.
Develop in Stages, Purchase in Stages
Ideally you will design your system to have three environments: Development, Test/QA and Production. We promote that you begin by investing in a Development environment.
The Development environment is not used for QA, or performance testing. It only needs to be configured to enable developers to do their work capturing user business requirements. During the initial development phase, business requirements are modeled, and the batch processes required to support them start to be understood. Through the iterations of the development process, the hardware configurations required to support processing requirements start to be objectified.
In this way, a Development environment helps empirically to define the requirements for a QA. And because QA needs to be as close to identical with Production as possible, the QA infrastructure defines what is required for Production.
This observation is accurate because of the strict demands of change management. It is not possible to predict the impact of a change in one architecture based on the impact that the change has on an architecturally different one. So QA needs to be a duplicate of Production. Opting to use a single box for Development and QA locks you in a paradox: You need to estimate your Production infrastructure specifications before you know your Production infrastructure requirements.
Unless, of course, you are comfortable faking it…
Generic Infrastructure Recommendations
Production performance issues invariably are the result of process contention for the same hardware resources. We measure, or estimate, the degree of this contention, and refer to it as concurrency. We can talk about concurrency as it applies individually to memory, processor, disk, network, etc. And the overall concurrency that any given system is able to support is determined by the throughput of the entire software and hardware infrastructure.
High-level and generic recommendations for infrastructures designed to support Essbase are:
1. Minimize processor related variations in performance by configuring Essbase to run using dedicated resources. Logical partitions (LPARs), or virtual environments, must be able to be controlled so that the Administrator is sure that each Essbase process has full access to the resources being assigned to it
2. Minimize storage I/O conflicts between Essbase and other 3rd party applications (e.g. Essbase vs. RDBMS), between different Essbase applications, and intra-cube I/O contention (e.g. multiple business rules, CALCPARALLEL and queries). In demanding Essbase environments, it is good practice to ensure that Essbase uses dedicated devices for storage. In extremely demanding processing situations, an individual cube might need to be provided dedicated storage.
3. Minimize memory conflicts by ensuring that processes have sufficient RAM resources to complete without the OS having to use virtual memory space.
Generic recommendations for methodologies to determine system requirements:
1. Performance engineering software simulations should be used to determine optimal hardware settings to support the Essbase server.
2. Performance engineering software simulations should be used to determine the optimal Essbase settings to support loading, exporting, querying and aggregating.
When attempting to estimate requirements to support Essbase Server, concurrency eventually means the simultaneous request for processor resources. Concurrency analysis can of course be applied to any infrastructure resource. It is not too inaccurate to think of bottlenecks essentially as where the concurrency rubber hits the infrastructure road.
When not able to perform a concurrency analysis using accurate simulations of end-user activities, sizing must be performed using estimates for concurrency.
Concurrency and Essbase
The classic way that concurrency is estimated derives from the total number of users of the system. This is probably because it is easy, and the number of users for a system is available very early in the application lifecycle. It is a perfectly legitimate way to begin to think about concurrency.
The number of total users is factored by a driver to estimate the number of connected users, and the connected user count is factored by another (perhaps the same?) driver to estimate concurrency. Ten percent (10%) appears to be the de facto standard used for this driver. So, for example, 3,000 named users represents 300 connected users and, ultimately, an estimated concurrency of 30.
How accurate is this for helping to size Essbase infrastructures? Recalling the above definition of concurrency as the “simultaneous request for processor”, the concurrency estimate of 30 turns out to be very large indeed.
In terms of Essbase, concurrency is best understood within the context of peak usage factored by response time. This is important because Essbase performance is invariably determined by a combination of the three characteristics of simultaneous requests, peak usage and response time.
Peak usage specifies concurrency across a period of time. We are looking to understand, in precise detail, the number of concurrent requests that are going being made of the processors, and for the length of time that these requests are expected to occur.
If we take our figure of 30 simultaneous requests, and add to that the duration of 4 hours, we end up with something like “peak usage for Essbase is that, at any point in time during our peak 4 hour period, 30 simultaneous requests are being made for processor”. When we factor the average response time for these requests, we begin to develop a conceptual framework for defining an infrastructure to support concurrency. How does response time factor into this?
For the sake of discussion, let’s inaccurately define “simultaneous” as a one second interval. We write “inaccurately” because we mean that concurrent requests theoretically are occurring whenever we snapshot the Essbase Server. Using one second as the default interval is useful because it simplifies an accurate estimation of concurrency.
In our example, peak usage now means “every second for four continuous hours, 30 new requests for processor are being made by users”. If the average response time is in the order of 5 seconds (and five seconds is a very optimistic response time given the current trend of customer uses of Essbase), then we have a potential problem because by the time that the first 30 requests have been processed, we have an application where slightly less than 30 * 5 more processing requests have been generated. The result is that we have a queue of just under 150 requests occurring within the first 5 seconds of peak runtime. And this activity is expected to happen for four solid hours.
What kind of server can reasonably be expected to be able to perform under such a workload?
Definition of Concurrency
We broaden the definition of concurrency to be “the number of requests for processor that can occur within an average response time”. This helps delineate the relation between response time and concurrency: Concurrency varies directly with response time.
For example, we recently reviewed a customer environment for performance. The average baseline query response time (~25 seconds) was measured before and then during while increasing workload demands were made of the server. The average response time jumped from 25 to over 450 seconds before having to be stopped. The test was able to be run for slightly less than 30 minutes. The connected user count was only ~5% of the actual anticipated connected user volume.
How would this scenario have been described if this were an actual Production run rather than a simulation? “We saw more or less acceptable performance for a few minutes this morning before the entire system became totally, completely unresponsive…”
Hopefully that comment isn’t familiar to very many readers. Sadly, it is possible for an improperly configured Essbase Server environment to overwhelm powerful server infrastructures.
When thinking about defining a server infrastructure for Essbase, it is necessary to conceive concurrency as having two characteristics:
1. requests for processor
2. average response time
Expressing the relation scientifically we get:
We are even tempted to construct the following formula for computing concurrency:
Concurrency = Requests * Response_Time
At the level of understanding desired here, processor concurrency refers to the number of requests that occur within the average peak response time. This explains how concurrency and performance vary over time.
During non-peak periods, the number of requests for processor is low, average resulting response times are low, and so too is concurrency. During peak periods, on the other hand, when the number of requests for processor is high, both the average response time and concurrency increase.
The worst case for customers occurs when requests are so high as to dramatically reduce the number of available processors per request. Response times become so attenuated that the entire application becomes unresponsive.
The unresponsiveness might, however, be the expected behavior:
"Because computations in a concurrent system can interact with each other while they are executing, the number of possible execution paths in the system can be extremely large...Concurrent use of shared resources can be a source of indeterminacy leading to...starvation."
When either the number of requests for processor or the average response time is under-estimated, the accuracy of the proposed infrastructure specifications is undermined.
In our final entry will discuss Essbase processes and present methods for estimating server requirements.