Convergence: SQL Server and Win. Perf. Counters That Matters

Waldo · 2007-03-23

Luc and kriki asked me to post the convergence sessions about sql performance in this forum...

What to start looking at ... .
Why should you tune your database?
Loss of user productivity
Cust of new hardware
Harware may only temporarily resolve
Understanding load being placed on your server
...
Seems logical, doesn't it, but try to explain to your customes that the application should be 'tuned' towards the usage of the application... .

The hardware restricting performance, is usually: Disk I/O, CPU, Memory and Network.
Most important: The Disk I/O. Some general recommendations:
- Use Physical disk instead of logical disk
- Sec/trasfer should be less than 15ms
- Transfers/sec should be less than 120 per disk
- Current disk queue length should be less than (2* # disks)
- Disk bytes/sec should be less than (10MB/sec per disk)
You should use the profiler, together with these counters, and see what it gives... .

Now, he began explaining about the RAID volumes. The main message: RAID 5 has got a poor write performance ... RAID 10 has got a high read performance. So: the .mdf is best but on RAID 10. The .log can go on RAID 1, but if shown pressure, RAID 10. Don't put the log on RAID 5. The tempdb is best put on a RAID 1, but if shown pressure, RAID 10. The Master should be put on RAID 1.

The causes of high I/O could be:
Memory pressure / Excessive paging / Severe blocking caused backlog of I/Os / Poorly designed queries (missing key indexes) (important for NAV!) / RAID 5 .

Demo time:
Just showing a "busy" machine with the performance counters above. Random performance issues: probably blocking.

Second thing to be looking into is CPU. Interesting counters are:
% processor time < 80%
% privileged time < 10%
Processor queue length < 2
Context switches/sec < 1000
SQL re-compilations/sec (lower is better)

The causes of CPU bottlenecks could be:
Compiles/recompiles for exec. Plans / Hash joins / Aggregate functions / Data sorting / Disk I/O activity / Other apps/services / Screen savers

Demo time:
He showed creating tables and dropping inside a loop and outside a loop. The recompiles in the second scenerio was (as expected) better.

Third: Memory
These are the counters:
Available MB > 10Mb
Pages/Sec < 50-300 (depending on disk perf)
Buffer cache hit ratio >= 90%
Free pages > 640
Memory grants pending = 0 (quite important!)
Target server memory - physical memory
Total server memory - target memory

Things you should take in count to resolve the memory bottleneck:
- Set to dynamically allocate or raise max
- Increase physical RAM
- Evaluate high read count queries
- /3GB switch in boot.ini (!)
- /PAE switch in boot.ini + AWE enable in SQL (!)

He didn't go into the network, because there should be no problem these days thanks to the current technology.

To conclude, there are also a number of SQL Specific counters. To investigate missing indexes: use the "Full Scan/sec". To investigat blocking, you can use:
- Total latch wait time (ms)
- Lock timeouts/sec
- Lock ait time (ms)
- Number of deadlocks/sec

It was an interesting session, and you should download the presentation if it comes available on Mibuso ... .

DenSter · 2007-03-24

This is another very useful session, and I really hope that the slides will become available. It was a very 'geeky' topic (they really amped it up on the geekiness

), so some of it may not make sense to everyone. I also don't know if all of the counters matter in an NAV implementation, but it gives you a direction of what to look for, so you can tell if it is memory, CPU, disks, etc.

Marije_Brummel · 2007-03-24

The most usefull counter for NAV is diskactivity!! Match this with the reads in Profiler and you have your source of many problems. \:D/

Of course, before you reach this level of performance tuning you first need to add a maintenance plan and tune SIFT and Indexes.

DenSter · 2007-03-25

Well yes and no....

If you find a customer with RAID 5 on a 10 Mbit line using VPN to connect to the database directly, there are more obvious things to do than tuning indexes. Our approach is to verify that the hardware is ok, and tackle that first if it is problematic. Once there are still performance problems on 'good hardware' then you start the other things.

Obviously you can do some of the obvious checks while you're reviewing the hardware, such as the checking the largest tables, take a quick scan of the (SIFT) indexes. If the largest tables are the ledger entry dimension entry tables and the analysis view tables for instance, you have a pretty good candidate for quick wins.

Waldo · 2007-03-25

I always first look at the hardware / infrastructural configuration ... if it "should" be OK, I start looking further at the interesting stuff

kriki · 2007-03-26

Waldo wrote:

I always first look at the hardware / infrastructural configuration ... if it "should" be OK, I start looking further at the interesting stuff

I also use this approach. But sometimes the customer doesn't want to change the hardware...

Waldo · 2007-03-26

Yes, but then at least, the customer notices your disapproval ... knows when something is not right ... so your butt is covered

DenSter · 2007-03-26

Plus you get more billable hours out of a situation like that, until the customer starts to refuse payment

Howdy, Stranger!

Quick Links

Categories

Convergence: SQL Server and Win. Perf. Counters That Matters

Comments