Robin and I began our investigation by loading the data on our own 32 bit Essbase version - to see what we could see, as it were. The load completed in about ~4900 seconds.
As I monitored the database statistics using esscmd, I noticed fragmentation statistics. The data was not sorted. (Not sorted?!? Good grief! Robin and I hadn’t seen that in a decade). When we performed a load with sorted data it completed in ~900 seconds. Repeating the customer’s load procedures, we noticed that data loaded efficiently up to approximately ~70,000 blocks but bogged down considerably from that point until all ~350,000 blocks were created.
When I mentioned fragmentation/sorting to the customer it was clear from his initial response to me that he was not entirely aware of the relation between sorting data and loading data to a BSO cube. Rather, he stated that he did not really care whether the load was optimal because a 20‑minute data load time fits easily within his batch window. Given that my test results did not speak to the issue of 64-bit load times, we agreed to convene the next workday, online, to examine the issue together.
After we confirmed every setting between the two cubes environments were identical, we initiated a load on the 64-bit server. We killed the data load after it was clear that it had bogged down, and began to crawl. It would have been more eventful to sit and watch the grass grow. Since watching Windows performance monitor is even more fun than watching grass grow, we opened up a session there - to see what we could see.
CPU activity was down at that grass growing pace. When we investigated further we noticed there was a disk bottleneck by observing the average disk queue metric. After the initial (~70K)data blocks were created, the queue dramatically increased per the image below.
Compare the above chart to the queue that is created during the sorted data load:
Checking the commit level (default 3000) caused me to experience internally the sound of a fingernail scratching across a blackboard. The reason that this was bothering me is that the negative impact of unsorted data was being exacerbated by the commit level. Together for this customer, they caused some sort of code issue to rear an ugly head in the 64‑bit version.
Say what? Here is what I think is happening.
As blocks are being created and handed over to the Windows OS to be written, new data is being read from the source files. That the data is not sorted means that Essbase may have to fetch data already sent to disk in order to update that block with new (unsorted) input data.
What happens when the block fetch request is in fact still in the next batch of 3000 blocks waiting to be committed? I think that Windows will need to commit the block to disk before it can respond to the fetch request. In the meantime, the data load has stopped, and the CPUs are sitting around, probably having a smoke.
I don’t think it takes a rocket scientist to speculate that this situation creates a disk I/O nightmare that eventually impacts CPU utilization. The unacceptable disk queue is ultimately the result of not sorting the source data.
Notwithstanding the fact that something may be seriously wrong with the 64-bit Essbase data load routines, the disk I/O bottleneck really is occurring on the 32‑bit server, but only to a lesser extent.
On the customer’s test environment we were able to use 64‑bit Essbase to load the unsorted data in approximately 500 seconds. This was accomplished simply by setting the commit block parameter to 0 (zero) causing Essbase to delay commits until the load had successfully completed. Be aware that not all customer Essbase environments can accommodate this setting.
You tell me why anybody would ever consider ensuring that their Essbase server is optimally tuned and configured? If it ain’t broked, don’t fix it! Right?
How many customers out there have Essbase environments that are just waiting to self destruct because the next change to processing requirements happens?
Next month, I hope to do a web presentation to show what to look for in their Essbase application logs to determine whether they need to tweak tune, perform open heart surgery, or perhaps order a coffin. Interested?