
Thursday, April 1, 2010

Infrastructure Sizing for Essbase (part 3)

This entry concludes our discussion of estimating Essbase infrastructure specifications.


Understanding Essbase Processes

Batch Processing
There are two components to Essbase batch refresh processes: metadata and data are updated, and then data is pre-processed before being made available to users. Pre-processing (i.e., aggregation creation) is where the administrator designs how Essbase computes and stores cube values in order to enhance query responsiveness. Computing and persisting data is demanding of processor, network, and disk resources: during aggregation creation, Essbase systematically reads data in, calculates/aggregates, and writes data out, sometimes for extended periods (hours).

Resource demands and best practices dictate that batch processing be carried out in isolation from end-user query activity. Batch routines are usually run this way so that the administrator can be sure that the required server resources can be allocated to specific Essbase processes.

Running Batch in Isolation
When batch processes are run in isolation, end-user activities are temporarily suspended. Isolating events in time is the easiest way to ensure that the resources required to perform the batch processes are available without restriction. From an infrastructure perspective, all that is required is that the server be configured with sufficient resources to process the batch routines efficiently. From the user perspective, however, it means that there is potentially a significant period of time when the application is not available. While this approach minimizes the hardware requirement, end-user downtime is becoming less and less tolerable for enterprise analytic applications.

Running Batch in Parallel
Running in parallel means batch processes run coincident with end-user activities. The architecture is configured with sufficient resources to enable efficient batch processing without any significant degradation in the performance of end-user activities. This minimizes the impact on the end user, but it maximizes infrastructure requirements. In this scenario there are possibly long periods of time when the infrastructure is under-utilized.

Underestimating resources here has two unfortunate characteristics. Most of the time the system appears under-utilized; during times of peak use, however, the whole system becomes a bottleneck for batch as well as user processes.

Peak usage always coincides with the delivery of critical functionality. Failure during peak periods will undermine user confidence in the entire application.

Query Processing
The quality of the end-user experience of Essbase is defined by query performance characteristics. Query performance depends upon three factors:

· CPU availability - a query is a single-threaded event and thus requires one CPU
· Query Size - the amount of data that Essbase is required to retrieve from disk
· Dynamic Processing - the number and degree of dynamic computations


The objective of developing an efficient Essbase computing environment is to create one where the resources required to support batch processes do not compete with the resources required for end-user querying. Large queries that take minutes to complete have a much greater chance of being concurrent with other processing requests. Tuning the cube for more efficient response times mitigates this. At some point, however, the query response time floor is reached, and hardware appears as the only way to increase performance. This can be achieved in different ways: bigger and faster hardware can be augmented by scaling the environment horizontally across multiple servers, by scaling the environment vertically on individual servers, or both.

Storage Requirements
Oracle recommends that customers use dedicated storage for Essbase whenever possible. Performance limitations on Essbase are often disk as well as CPU related: applications calculating and reporting data drive high I/O utilization on a storage system. For SAN devices, dedicated fiber connections are very beneficial, particularly when calculating multiple applications simultaneously. While not necessary, additional throughput gains can be achieved for Block Storage Option applications by storing the index and data files on different physical drives and controllers.

For an Aggregate Storage Option cube, data is stored in both the Temp and Default directories during data load and aggregation processing. By using separate physical drives and I/O channels for the location of these files, you can reduce I/O contention and thereby improve overall performance. Breaking up the aggregate .DAT files by storing large databases on different physical drives with different I/O channels can also improve query performance.

Network Speed and Bandwidth
A full-duplex 1 Gb network connection between any servers in the same zone that host communicating EPM components is optimal. If all the EPM components are on one physical server, use a server network connection up to the maximum throughput supported by the network's routers.

Other Performance Related Considerations

Multi-Threading Technologies
Multi-threading processor technologies spawn two threads per physical CPU to emulate multiple processing cores. This way, the OS perceives twice as many logical CPUs as there are physical ones. For some applications, the result can be a throughput improvement of as much as ~30%.

Not all software applications benefit from these technologies. In general it has been found that Essbase performance is not enhanced, and often degrades.

Essbase performance is, however, application and cube dependent. A small number of customers anecdotally report increased responsiveness using these multi-threading technologies. Considering that there are platform implementation differences, and that it is not possible to predict accurately how any given cube will respond to this technology, it is best to test to determine what is appropriate for your Essbase environment.

Multiple Cores and Floating Point Units (FPU)
This is perhaps an outdated topic because of recent trends in processor technologies. Essbase performs floating point arithmetic, so single FPU processors represent a bottleneck for Essbase. To take advantage of the performance gains from multiple-core technologies, ensure that each core is configured with its own FPU.

LPAR/VM Definitions
Virtual machine and some LPAR definitions of Essbase servers are used to share resources. While these configurations are supported by Oracle for Essbase, the nature of a shared environment makes it technically challenging to control server resources, and in particular to ensure what processor resources are being allocated to Essbase. The result is that Essbase performance cannot be controlled to deliver a consistent level of service. Where it is not possible to be sure that the server will meet the end-user service level agreements, we recommend against running Essbase in a shared environment.

Estimating Core and RAM Requirements
If you must estimate infrastructure specifications before processing requirement details are known, here are two methods that can be used. Both make assumptions about how to gauge concurrency. The second includes adding cores to support batch processing requirements and factors in a desired response time for end-user querying.

Processor Cores per Application based on Active Users
This method is easy, but potentially expensive. When use-case scenarios are not well known, estimate processor and memory requirements as follows: allocate six cores per Essbase application, and add 2GB of RAM to the server for each core.

The use-case scenario that this rule was devised for is an Essbase and Relational reporting environment with 2000 named users where 500 users are active. In large computing environments, this can easily lead to very pessimistic (i.e. expensive) estimates for core and RAM requirements.
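As a rough sketch, the rule can be expressed directly (the function name and the example application count are illustrative, not part of the original guidance):

```python
def size_by_application(num_applications):
    # Rule of thumb: six cores per Essbase application,
    # plus 2 GB of RAM for each allocated core.
    cores = 6 * num_applications
    ram_gb = 2 * cores
    return cores, ram_gb

# A server hosting four applications under this rule:
print(size_by_application(4))  # (24, 48): 24 cores, 48 GB RAM
```

As the text warns, applying this rule to a large environment with many applications quickly produces very expensive estimates.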

Users per Core
Another option is to base the number of cores on the number of users, using batch concurrency and query response time assumptions. RAM estimation allocates between 2 and 4 GB of RAM per application rather than per core.

The examples assume that batch routines are run in parallel with end-user processes. For an application whose batch routines are run in isolation, core estimates should be reduced accordingly.
The examples that follow are experience-based estimates that can only be validated with realistic load testing. An additional complication of this method is that report design has a significant impact on performance, RAM, and core requirements.

50 Users Per Core
This method assumes a desired 15-second response time for queries/reports, with the longest query taking no longer than 30 seconds. To this base number of cores, add the cores required for parallel batch processing.

25 Users Per Core
This method assumes a desired 15-second response time for queries/reports where minimal increases in response times can be tolerated. Add an additional core for each report that runs for 60 seconds or more. To this base number, add the number of cores required for parallel batch processing to compute the total number of estimated cores.
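Both users-per-core rules can be sketched in one small estimator (a hypothetical helper; the function name, parameters, and example figures are illustrative, and any real sizing should be validated with load testing as noted above):

```python
def size_by_users(active_users, users_per_core, long_reports=0, batch_cores=0):
    """Estimate total cores using the users-per-core rules of thumb.

    users_per_core: 50 for the relaxed rule, 25 for the stricter rule.
    long_reports:   count of reports running 60+ seconds (stricter rule
                    adds one core each).
    batch_cores:    cores reserved for parallel batch processing.
    """
    query_cores = -(-active_users // users_per_core)  # ceiling division
    return query_cores + long_reports + batch_cores

# 500 active users under the 50-users-per-core rule, 4 batch cores:
print(size_by_users(500, 50, batch_cores=4))                  # 14
# Same users under the 25-users-per-core rule, two long-running reports:
print(size_by_users(500, 25, long_reports=2, batch_cores=4))  # 26
```

RAM would then be estimated separately, at 2 to 4 GB per application rather than per core.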
______________________________________________________________________________________

John French

Principal Service Delivery Engineer
Oracle Advanced Customer Services - GLOBAL
Jensen Beach, FL

Richard (Rick) Sawa
Principal Service Delivery Engineer
Oracle Advanced Customer Services - GLOBAL
Columbus, OH

Thursday, March 25, 2010

Infrastructure Sizing for Essbase (part 2)

Our previous entry was a preliminary discussion of infrastructure sizing. Below we discuss generic considerations and present a more precise definition of processor concurrency for Essbase.

Methods for Estimating
We recommend not purchasing Production hardware before the Production user (software and business) requirement details are known. We do not know of any way to estimate hardware specifications for requirements that are not quantified in some tangible way. Every infrastructure estimation is based upon an explicit or implicit model of what the use cases will actually be.

Develop in Stages, Purchase in Stages
Ideally you will design your system to have three environments: Development, Test/QA, and Production. We recommend that you begin by investing in a Development environment.

The Development environment is not used for QA or performance testing. It only needs to be configured to enable developers to do their work of capturing user business requirements. During the initial development phase, business requirements are modeled, and the batch processes required to support them start to be understood. Through the iterations of the development process, the hardware configurations required to support processing requirements begin to take objective shape.

In this way, a Development environment helps empirically define the requirements for a QA environment. And because QA needs to be as nearly identical to Production as possible, the QA infrastructure defines what is required for Production.

This observation is accurate because of the strict demands of change management. It is not possible to predict the impact of a change in one architecture based on the impact that the change has on an architecturally different one, so QA needs to be a duplicate of Production. Opting to use a single box for Development and QA locks you into a paradox: you need to estimate your Production infrastructure specifications before you know your Production infrastructure requirements.

Unless, of course, you are comfortable faking it…

Generic Infrastructure Recommendations
Production performance issues invariably are the result of process contention for the same hardware resources. We measure, or estimate, the degree of this contention, and refer to it as concurrency. We can talk about concurrency as it applies individually to memory, processor, disk, network, etc. And the overall concurrency that any given system is able to support is determined by the throughput of the entire software and hardware infrastructure.

High-level and generic recommendations for infrastructures designed to support Essbase are:

1. Minimize processor related variations in performance by configuring Essbase to run using dedicated resources. Logical partitions (LPARs), or virtual environments, must be able to be controlled so that the Administrator is sure that each Essbase process has full access to the resources assigned to it.

2. Minimize storage I/O conflicts between Essbase and other 3rd party applications (e.g. Essbase vs. RDBMS), between different Essbase applications, and intra-cube I/O contention (e.g. multiple business rules, CALCPARALLEL and queries). In demanding Essbase environments, it is good practice to ensure that Essbase uses dedicated devices for storage. In extremely demanding processing situations, an individual cube might need to be provided dedicated storage.


3. Minimize memory conflicts by ensuring that processes have sufficient RAM resources to complete without the OS having to use virtual memory space.

Generic recommendations for methodologies to determine system requirements:

1. Performance engineering software simulations should be used to determine optimal hardware settings to support the Essbase server.
2. Performance engineering software simulations should be used to determine the optimal Essbase settings to support loading, exporting, querying and aggregating.

When attempting to estimate requirements to support an Essbase Server, concurrency ultimately means the simultaneous request for processor resources. Concurrency analysis can of course be applied to any infrastructure resource; it is not too inaccurate to think of bottlenecks as where the concurrency rubber hits the infrastructure road.

When not able to perform a concurrency analysis using accurate simulations of end-user activities, sizing must be performed using estimates for concurrency.

Concurrency and Essbase
The classic way that concurrency is estimated derives from the total number of users of the system. This is probably because it is easy, and the number of users for a system is available very early in the application lifecycle. It is a perfectly legitimate way to begin to think about concurrency.

The number of total users is factored by a driver to estimate the number of connected users, and the connected user count is factored by another (perhaps the same?) driver to estimate concurrency. Ten percent (10%) appears to be the de facto standard used for this driver. So, for example, 3,000 named users represents 300 connected users and, ultimately, an estimated concurrency of 30.

How accurate is this for helping to size Essbase infrastructures? Recalling the above definition of concurrency as the “simultaneous request for processor”, the concurrency estimate of 30 turns out to be very large indeed.

In terms of Essbase, concurrency is best understood within the context of peak usage factored by response time. This is important because Essbase performance is invariably determined by a combination of the three characteristics of simultaneous requests, peak usage and response time.

Peak usage specifies concurrency across a period of time. We are looking to understand, in precise detail, the number of concurrent requests being made of the processors, and the length of time over which these requests are expected to occur.

If we take our figure of 30 simultaneous requests, and add to that the duration of 4 hours, we end up with something like “peak usage for Essbase is that, at any point in time during our peak 4 hour period, 30 simultaneous requests are being made for processor”. When we factor the average response time for these requests, we begin to develop a conceptual framework for defining an infrastructure to support concurrency. How does response time factor into this?

For the sake of discussion, let’s inaccurately define “simultaneous” as a one second interval. We write “inaccurately” because concurrent requests are, strictly speaking, occurring at whatever instant we snapshot the Essbase Server. Using one second as the default interval is useful because it simplifies an accurate estimation of concurrency.

In our example, peak usage now means “every second for four continuous hours, 30 new requests for processor are being made by users”. If the average response time is on the order of 5 seconds (and five seconds is a very optimistic response time given the current trend of customer uses of Essbase), then we have a potential problem: by the time the first 30 requests have been processed, slightly fewer than 30 * 5 more processing requests have been generated. The result is a queue of just under 150 requests within the first 5 seconds of peak runtime. And this activity is expected to continue for four solid hours.
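The backlog arithmetic above can be sketched as a toy queue model (the 32-core server size is an illustrative assumption, and real queuing behavior is more complex; unlike the back-of-envelope figure of ~150, this model also credits requests completed during the interval):

```python
# Toy queue model for the scenario above: 30 new requests arrive every
# second, and each request holds one processor for the assumed average
# response time of 5 seconds.
def queue_after(seconds, arrivals_per_sec=30, response_time_sec=5, cores=32):
    completions_per_sec = cores / response_time_sec  # requests finished per second
    backlog = 0.0
    for _ in range(seconds):
        backlog = max(backlog + arrivals_per_sec - completions_per_sec, 0.0)
    return backlog

print(round(queue_after(5)))         # backlog after the first 5 seconds: 118
print(round(queue_after(4 * 3600)))  # backlog after a 4-hour peak: hopeless
```

Even granting the server 32 cores, the backlog grows without bound because arrivals outpace completions for the entire peak window.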

What kind of server can reasonably be expected to be able to perform under such a workload?

Definition of Concurrency
We broaden the definition of concurrency to be “the number of requests for processor that can occur within an average response time”. This helps delineate the relation between response time and concurrency: Concurrency varies directly with response time.

For example, we recently reviewed a customer environment for performance. The average baseline query response time (~25 seconds) was measured first in isolation and then while increasing workload demands were made of the server. The average response time jumped from 25 to over 450 seconds before the test had to be stopped. The test ran for slightly less than 30 minutes, and the connected user count was only ~5% of the actual anticipated connected user volume.

How would this scenario have been described if this were an actual Production run rather than a simulation? “We saw more or less acceptable performance for a few minutes this morning before the entire system became totally, completely unresponsive…”

Hopefully that comment isn’t familiar to very many readers. Sadly, it is possible for an improperly configured Essbase Server environment to overwhelm powerful server infrastructures.

Processor Concurrency
When thinking about defining a server infrastructure for Essbase, it is necessary to conceive of concurrency as having two characteristics:

1. requests for processor
2. average response time

Expressing the relation scientifically we get:

Concurrency ∝ Response_Time

We are even tempted to construct the following formula for computing concurrency:

Concurrency = Requests * Response_Time

At the level of understanding desired here, processor concurrency refers to the number of requests that occur within the average peak response time. This explains how concurrency and performance vary over time.
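Taken literally, the formula gives a quick way to compare off-peak and peak scenarios (a sketch; the request rates and response times below are made-up figures):

```python
def concurrency(requests_per_sec, avg_response_time_sec):
    # Concurrency = Requests * Response_Time: the number of requests
    # "in flight" within one average response time.
    return requests_per_sec * avg_response_time_sec

# Off-peak: few requests, fast responses -> low concurrency
print(concurrency(2, 3))    # 6
# Peak: many requests and slower responses compound each other
print(concurrency(30, 25))  # 750
```

The compounding is the key point: at peak, the rising request rate itself pushes response times up, so both factors in the product grow together.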

During non-peak periods, the number of requests for processor is low, average resulting response times are low, and so too is concurrency. During peak periods, on the other hand, when the number of requests for processor is high, both the average response time and concurrency increase.

The worst case for customers occurs when requests are so high as to dramatically reduce the number of available processors per request. Response times become so attenuated that the entire application becomes unresponsive.

The unresponsiveness might, however, be the expected behavior:

"Because computations in a concurrent system can interact with each other while they are executing, the number of possible execution paths in the system can be extremely large...Concurrent use of shared resources can be a source of indeterminacy leading to...starvation."
(http://en.wikipedia.org/wiki/Concurrency_(computer_science)#cite_note-cleaveland1996-0)

When either the number of requests for processor or the average response time is under-estimated, the accuracy of the proposed infrastructure specifications is undermined.

____________________________________________________________________


In our final entry we will discuss Essbase processes and present methods for estimating server requirements.


John French
Rick Sawa

Friday, March 12, 2010

Infrastructure Sizing for Essbase (part 1)

This is the first of a series of three blog entries where we discuss estimating infrastructure specifications to support Essbase.


John French
ACS Principal Service Delivery Engineer

Richard (Rick) Sawa
ACS Principal Service Delivery Engineer



Overview
We provide a high-level discussion of what is involved in assessing/sizing a server infrastructure to support Essbase. We start by briefly outlining the requirements and procedures involved with assessing an existing environment. This establishes a frame of reference for the discussion that follows on guidelines for estimating infrastructures to support Essbase when the details of end-user and batch processing requirements have yet to be defined.

Introduction
There is no replacement for systematic testing to determine the specific hardware specifications required to support Essbase, no matter how hard the shoe strikes the podium. We think that everyone knows that this is true. And it’s also true that at the very beginning of a new development initiative, the proverbial cart is before the horse. How does one define hardware specifications for processing requirements that are not yet quantified? The short answer is that you can’t.

In the absence of requirements, every estimation for hardware is based on assumptions.

Essbase Server Assessments
We frame the sizing discussion by briefly presenting how we evaluate existing Essbase servers when processing requirements are fully understood. When the assessment reveals that the infrastructure is wanting, an estimation of more appropriate server specifications can be brought forward. The criteria used to draw up these new specifications form an ideal list of criteria for assessing Essbase infrastructures.

Once the ideal requirements are understood, you will be able to compare them to what is available in more generic assessments. Subtracting the generic criteria from the ideal gives an indication of how accurate the sizing estimate can be expected to be.


Assessment Checklist
The following is a summary list of the objects and information that eServices reviews in order to complete an Essbase infrastructure server assessment:

1. Essbase Server Configuration
a. Essbase.cfg Settings
2. Essbase Application Settings
a. Application Logs
b. Cube Outlines
c. Cube Statistics
d. Calc Script/Business Rules Procedural Logic and Settings
e. Batch Process Scripts
3. Hardware Server Configuration
a. Operating System
b. Processors (number, speed & architecture)
c. RAM
d. Virtual Machine configuration
e. LPAR definition
f. Server Application profile
g. Disk configuration
h. Network configuration
4. Server Performance Monitoring Logs
a. RAM
b. CPU
c. Network
d. Disk

Items 1 and 2 contain detailed software requirements, which imply specifications for the hardware listed in item 3. The first two items provide the specific sizing criteria for hardware.

In an in situ environment, items 1, 2, and 3 are already working together and have specific content. A sizing assessment where software requirements are minimally known means that assumptions must stand in for them. The accuracy of the sizing estimate is strictly correlated with the accuracy of these assumptions.


Assessment Components
Essbase Server
Essbase objects are analyzed for settings (caches, CALCPARALLEL, and so on). Requests for processor, network and disk resources are extracted from the Essbase Application logs in the form of response times for events. Response times are combined manually to provide a single Essbase Performance Log.

Essbase Optimization
Every Essbase server review should look at the Essbase cube designs to determine whether they are following best practices, and whether tuning methodologies can be invoked to increase performance.

Complete application design reviews involve coordinating the detailed business requirements with cube design decisions. Full reviews differ little from a full implementation in the amount of time and resources they consume, which usually stands far outside what is possible within the timeframes allocated for an assessment.

Once, however, the cubes and their processes have been optimized within time and resource constraints, a more reliable determination of hardware requirements can be made. Sometimes a tuning effort is sufficient to enable the system to perform up to service level agreements, and sometimes not.

In our opinion, tuning is mandatory because it averts the criticism that hardware is simply being thrown at the problem.

Supporting Infrastructure Components
The Essbase configuration and script settings are cross-referenced with infrastructure settings and configuration. The infrastructure (RAM, CPU, etc.) is monitored and measured during Essbase processing.

Infrastructure Analysis
Concurrency is accurately extracted from the Essbase performance logs by identifying overlapping response times. The contents of the manually generated Essbase performance log are correlated with infrastructure performance log statistics, and subjected to analysis.

Correlating the Server Performance Monitoring Logs with Essbase events enables you to compare what is being allocated to Essbase processes with how the underlying server hardware, operating system, and supporting infrastructure components are behaving.

Consider the following two charts, created during an infrastructure assessment. They show the sawtooth behavior of both disk and CPU activity. Comparing the teeth directly, an inverse relationship between disk and CPU is clearly evident. Vertical lines have been inserted to illustrate:


When disks were busy, CPUs became idle, and vice versa. The activities being measured were data load and aggregation batch process routines. From this we were able to see the disk bottleneck and the impact that it was having on CPU utilization.

This type of measurement makes it possible to assess server behavior, and can be incorporated to provide accurate infrastructure specification criteria.

To sum up, in situ infrastructure assessments analyze detailed Essbase and infrastructure metrics to determine how and why the infrastructure is responding to specific Essbase processing requirements. An analysis is made of Essbase design characteristics, and tuning techniques are applied to ensure that Essbase processes are as efficient as possible. The analysis of Essbase settings and processing requirements enables an accurate estimation of hardware should the current infrastructure be found wanting.

In the final analysis, a complete list of Essbase settings and processing requirements is requisite to estimating infrastructure requirements.
________________________________________


Next week we will continue with a general discussion of sizing concepts, focusing on how best to understand concurrency and Essbase.

Official, Youbetcha Legalese

This blog is provided for information purposes only and the contents hereof are subject to change without notice. This blog contains links to articles, sites, blogs, that are created by entities other than Oracle. These links may contain advice, information, and opinion that is incorrect or untested. This blog, links, and other materials contained or referenced in this blog are not warranted to be error-free, nor are they subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this blog, links and other materials contained or referenced in this blog, and no contractual obligations are formed either directly or indirectly by this blog, link or other materials. This blog may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. The opinions and recommendations contained in this blog(including links) do not represent the position of Oracle Corporation.

Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.