I'm scratching my head on this one, so I welcome any and all feedback that help us toward a solution.
In early December, one of our larger clients moved NAV to a new server. The old server was Windows 2003 with SQL 2005. The new server is Windows 2008 R2 with SQL 2008 SP1. The NAV database is 5.0 SP1. We did a technical update to 2009 SP1 cleint when we moved to the new server.
During the first few weeks everything seemed to run great. There was a noticable performance improvement. Most noticably in the time to run the nightly adjust cost. But over the past week the adjust cost has gotten noticably slower. Adjust cost was running 15 to 20 minutes a night. Over the past week, it is running 1.5 to 2 hours.
Here's some random information:
Database Size: 300 GB (30% free)
NAV client: 6.00.29958
Server: Dell M905 blade
Processor: 4 Quad core AMD 3.10 GHZ
All disk are on an EMC SAN (FC connection)
The NAV server has dedicated physical disk in the SAN
Data is on a RAID 10 with 42 drives
Log is on a seperate RAID 10 with 8 drives
There are no bugs - only undocumented features.
0
Answers
Could you identify poor performing queries with SQL Profiler? Maybe you could apply RECOMPILE hints for the affected tables (or Plan Guides)?
Basically the new OPTIMIZE FOR UNKNOWN feature (SQL 2008 and NAV 2009) should avoid bad QEP, but probably could also cause trouble ... :-k
NAV/SQL Performance Optimization & Troubleshooting
STRYK System Improvement
The Blog - The Book - The Tool
I have not run SQL Profiler on it yet. I'm planning to set it up for tommorrow night's Adjust Cost run. Would you have any specific options I should include?
In the area of plan caching. wouldn't a server reboot have clear those plans and reverted back to original behavior? The client rebooted the system yesterday and reported no change in behavior. On another note, there is a copy of this database on the same server. The copy was mad at the time of the original install. Adjust cost still runs fine in that database. So that does tend to point me toward some new query plan issue.
I would not include the QEP as this would remarkably increase the size of the TRC.
A restart of the SQL Server would surly erase all cached QEP, but maybe you could also try this with DBCC FREEPROCCACHE (= deleteing the procedure cache) first (users don't have to disconnect etc.).
As you describe there's no issue with the second DB I'd also say it's indeed some caching prob - so maybe indeed Plan Guide/RECOMPILE could help ...
Good Luck for the tracing!
NAV/SQL Performance Optimization & Troubleshooting
STRYK System Improvement
The Blog - The Book - The Tool
Lets take a case where the customer purchased 5,000 of an item and then sold them in 1s. You would now have 5000 application entries and the key being chose may have been ok when there were one or two applications, but not now. And because SQL is remembering the execution plan, it then uses the bad plan for all the following applications as well.
if this is the issue, then creating a 100% matching key to the
and
To do this you will have to go into C/SIDE and make sure the Navision key matches the filters and then make sure that index exists in SQL. Creating this wont be a permanent solution, but at least it will help to isolate the issue.
Thanks for the response.
The "old database" is a copy of live from about 6 weeks ago. I would not expect its data distribution to be much different from live. This client is a fresh produce processor. Materials are received, consumed, and shipped in a very short time span. With the exception of packaging, materials are consumed to a limited set of production orders and shipped on a limited set of sales orders. While I see your point, I don't think it is the issue here. One time when we will see what you describe is when they have invoiced packaging purchases. As these purchases can be consumed over hundreds of production orders. But this will also be accompanied by a sizable spike if the number of value entries created by Adjust Cost that evening. That has not occurred over the past week.
What's strange about this is it has happened all of a sudden. For the first several weeks, after moving to the new server, Adjust COst ran 40% faster than the old server. But this past week the bottom has dropped out. Now it is taking 3 to 4 times as long.
but you said that it was well in the first weeks... :-k
if you can try a newer build without too many problems for users.......
"Never memorize what you can easily find in a book".....Or Mibuso
My Blog
Independent Consultant/Developer
blog: https://dynamicsuser.net/nav/b/ara3n
Well, if comes down to those two affected queries maybe you could apply a RECOMPILE hint for each?
For example:
NAV/SQL Performance Optimization & Troubleshooting
STRYK System Improvement
The Blog - The Book - The Tool
My suggestion would be so also report this problem to Microsoft as well with high priority.
You are running on the last NAV builds (that we know of) on SQL 2008 so you should have the compile for average running meaning the execution plan should not be 'wrong' unless there is a bug in SQL.
This should be a very interesting project for the guys in Verbaek en Redmond to look at. Remember that the SQL2005 issues we thought was a 'bug' for a long time ended up being a design issue.
PS: 42 drives for such a 'small' SQL database is massive!
the SAN is new. the 42 drives is for future growth. it has only taken 3 years to get to this size
This is part of the response I got from MS Support:
"...The problem I have is that with your technical upgrade of the NAV client to NAV 2009 SP1, this customer is not running in a supported environment since a technical upgrade is not supported with NAV 2009 SP1. This has been posted on the Download page for some time now...."
That's news to me and I can't find any statement to that effect on the download page. They have asked for some additional info, but it is mainly along the lines of "you must have changed something". But the initial response looks like they are lining up the "not our problem" response.
"Never memorize what you can easily find in a book".....Or Mibuso
My Blog
Actually this has been policy for some years now, but not enforced excet by some countries. There have been a few posts here recently discussing it and it seems they are now enforcing it globally now.
Why is this for just NA? Or this already applies in EU?
WOW that's interesting. Actually shouldn't you create a new thread for this? Its a very important topic.
viewtopic.php?f=23&t=39672
As for not supporting it, when NAV 2009 was postponed, Microsoft was telling all customers that in order to move to 2009 when it is released is to upgrade to 5.0 first, and code wise it will be the same so you are not upgrading it again.
I couldn't believe the lies they were circulating, and now they won't support exe upgrade even when it's the classic client.
Business must be really bad that they start doing this.
We had clients where Microsoft on tech support issues would suggest to do exe upgrade to solve performance issues.
I'm guessing you cannot downgrade now upgrading 300 gig db won't be easy time wise.
Just to let you know, I put this in place but have not seen any sizable difference. The investigation continues. Been dealing with other issues these past few days.
This is a client that we have worked with since 2003. We are very familiar with the nature of their data and how they work, since we are heavily involved in the design and implentation of their business processes. There have been no changes that would have a bearing on this issue.
While the volume of data has grown over the years, we are talking about comparing behavior of just a month ago. The data volume is fairly consistent within that timeframe.
Your question got me digging a bit deeper. While waiting on my morning coffee.
How does SQL 2008 determine the sample size for updating statistics? Does it differ from SQL 2005 and also with table size?
If I look at the value entry index that seems to be a problem, in the live database the sample set is barely 1% of the records. While the other database is 100% of the records. The other database has not had statistics updated since being moved to 2008.
Auto update and Auto create statistics are off. A nightly job is run to selectively update indexes and statistics. The job uses the following criteria.
Indexes with less then 10% fragmentation are skipped.
Indexes with fragmentation between 10% and 30% are reorganized and their statistics updated
Indexes with fragmentation over 30% are rebuild (which also does a statistics update)
RIS Plus, LLC
Moving to a full update stats is something I will consider. The nightly maintenance widow is small and getting smaller. The sudden increase in Adjust Cost is creating problems with that. Not sure a nightly full would be doable, but a few time a week might. I'll have to time one and see.
The statistics for the index were last updated about 2 weeks ago. Since they were updated that tells me the index does get fragmented. The table has ~53 million records and has had about 1 million added since the last statistics update.
While there may be an opening for discussion (always welcome feedback) of the update process, that job has been in place since the system was put on SQL 2005 over 3 years ago. This performance issues occurred sometime after moving to SQL 2008. My observation was more about the different sampel set size being used. I'm question whether the 2008 statistics are legitimate with such a small sample set.
They will be running adjust cost later today. I will run a full stats update prior and report the results.
The issue seems to be that the default Update Statistics statement was using such a small sample set on a large table. I feel that may have had as much to do with the problem as the statistics not being updated for a few weeks.
I'm going to switch to a full statistics update and see how everything goes over the next few weeks.
Thanks to everyone who replied. I'll update this in a few weeks on how it's holding up. Sometimes you get so use to looking at a certain tree that you need someone to remind you of the forest.
If you run a complete update stats every day it will complete more quickly over a few days. I wouldn't be surprised if in a few days yours completes in about 5-10 minutes.
RIS Plus, LLC