For the TEMPDB database, create 1 data file per CPU.
Lining up the number of data files with CPUs has scalability advantages for allocation-intensive workloads.
It is recommended to have 0.25 to 1 data files (per filegroup) for each CPU on the host server.
This is especially true for TEMPDB where the recommendation is 1 data file per CPU.
Dual core counts as 2 CPUs; logical procs (hyperthreading) do not.
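As a quick sanity check before adding files, you can compare the server's CPU count against the number of tempdb data files it already has. A sketch (the `sys.dm_os_sys_info` and `sys.master_files` views exist in SQL Server 2005 and later):

```sql
-- Logical CPUs visible to SQL Server
SELECT cpu_count FROM sys.dm_os_sys_info;

-- Current tempdb data files (type_desc = 'ROWS' excludes the log file)
SELECT name, physical_name, size * 8 / 1024 AS size_mb
FROM sys.master_files
WHERE database_id = DB_ID('tempdb')
  AND type_desc = 'ROWS';
```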
Comments
Eric Wauters
MVP - Microsoft Dynamics NAV
My blog
I've been told that in SQL 2005, tempdb is used more, especially for Navision since it uses cursors.
If you have a correctly working database without this bug, TEMPDB should not be used heavily.
Right now I am hoping that somebody at Microsoft will finally wake up and solve this issue once and for all; basically it is the last major issue for a well-running system on SQL.
Of course there is the wishlist, like being able to use included columns from the NAV key designer and having the possibility to select only a few columns instead of *, but that has all been put on the list.
Using different filegroups in SQL can be very dangerous because all indexes are by default put in one filegroup.
If you are going to split the filegroups you have to decide which index goes where. This can be very difficult to decide because of the characteristics of an ERP system.
In an ERP system, indexes are used for so many different processes that it is very difficult to make such decisions, and the result can be that the load per filegroup is very uneven.
My recommendation would be to leave the striping to SQL, which does a pretty good job of it.
Good luck.
That way I don't have to worry about indexes.
The recommendation goes as low as 0.25 data files per CPU per filegroup, i.e. one file can serve up to 4 CPUs.
Also, when creating multiple files there is only an advantage if each can be placed on its own disk array. Placing them on the same array defeats the purpose, since they would be competing for the same resource. Also, never create more files than you have CPUs.
USE master;
GO
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'L:\MSSQL\data\tempdb_db1.mdf', SIZE = 60MB, FILEGROWTH = 20MB);
GO
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'L:\MSSQL\data\tempdb_db2.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
    (NAME = tempdev3, FILENAME = 'L:\MSSQL\data\tempdb_db3.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
    (NAME = tempdev4, FILENAME = 'L:\MSSQL\data\tempdb_db4.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
    (NAME = tempdev5, FILENAME = 'L:\MSSQL\data\tempdb_db5.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
    (NAME = tempdev6, FILENAME = 'L:\MSSQL\data\tempdb_db6.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
    (NAME = tempdev7, FILENAME = 'L:\MSSQL\data\tempdb_db7.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
    (NAME = tempdev8, FILENAME = 'L:\MSSQL\data\tempdb_db8.mdf', SIZE = 60MB, FILEGROWTH = 20MB);
GO
You really only need to do this on a busy OLTP system.
I attended a session on SQL 2008 recently. They were discussing VLDBs. At first the data sizes did not seem large, until I realized that they were talking Tera not Giga. :shock:
So the principles for deploying a 100 TB database are very different from those for a 100 GB db.
No, I'm not confusing the two. Objects are created within filegroups. Filegroups may contain one or more files. It is true that SQL will assign one disk process per file per CPU, but this is only an advantage if the performance bottleneck is the CPU. If the performance bottleneck is the disk subsystem then this will not help; in fact it may actually make things worse, since creating more disk processes will increase the I/O demand on the subsystem.
You cannot make blanket decisions without first understanding where the performance bottlenecks are in a given system.
It's always best to put any performance features in at the start of a project, trying to make changes a couple of years down the line can be very very painful.
Only in very rare cases, with a >100 GB database and high transaction volume, have I seen any benefit in breaking up the NAV database into multiple files and filegroups and placing them on their own set of disks. Normally, if you just add the disks to the primary RAID 10 data partition you were going to use to break up the database, you get the same if not better performance than splitting things up. Granted, there are scenarios that will be different. Breaking up the NAV database into different files and filegroups can be a painful and time-consuming process, as moving objects from filegroup to filegroup can be tricky.
Just my $.02 worth.
If Navision didn't generate parallel scans then you wouldn't need to turn off parallelism, so if the advice is to turn off parallelism then Navision must be suffering from CXPACKET waits. If you're using the Standard edition of SQL then the parallelism setting isn't too critical; Enterprise benefits most, especially on 2005. As I said, it's to do with throughput and decreasing latency on the system, which relates not only to the actual user activity but also to the underlying hardware; smaller systems can benefit too - it's all a matter of scale.
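Whether CXPACKET waits are actually significant on a given box can be checked from the cumulative wait statistics. A sketch (`sys.dm_os_wait_stats` is SQL 2005+; counters accumulate since the last service restart):

```sql
-- Top waits by total wait time; a large CXPACKET share relative to
-- the other waits suggests parallelism is worth investigating
SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```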
By the way, if you're going to use multiple files with SQL 2000 you also have to enable a trace flag.
You cannot make that blanket assumption. You must understand the performance bottlenecks of a particular system before deciding on configuration changes like this. I completely disagree that this should be implemented as a generic configuration.
I see that tempdb can represent up to 33% of my total transactions (dedicated server, SQL 2005), so tempdb is certainly being used. Tempdb is also used more within SQL 2005 than it was within SQL 2000.
I admit that I consider multiple files the same as I consider a number of other configuration options. It's quicker to defrag a number of smaller files than one large file, though, so it can enable you to manage NTFS fragmentation much more efficiently - just another point to ponder.
Just about everything you have said is a little off in a NAV/SQL implementation. NAV is a client-side, cursor-based app utilizing parameterization, not stored procedures, so parallelism is not necessary for NAV: NDBCS.DLL translates the C/AL code to single-line T-SQL statements, none of which require parallel execution, and it does not use SQL temp tables; the only thing tempdb is really used for in NAV is sorting. I have actually run side-by-side comparisons of separating NAV tables onto their own spindles versus just adding the new spindles to the existing RAID 10 array, and the difference was negligible; it all depends on the I/O patterns of the customer. I have only seen one case where it helped, and that was due to over-zealous reporting. Breaking up the DB into the same number of files as CPUs is an old SQL 6.5 thing; I have never seen it anywhere in any MSSQL documentation.
Parallel plans and parallelism have nothing to do with stored procedures; ad-hoc T-SQL / application-generated SQL, especially the kind that in the main starts with SELECT *, is more likely to generate a parallel plan than a stored proc. There's lots of info on how the optimiser calculates its plans, and at what stage it decides to use a parallel plan, on MSDN and TechNet, as well as in the various "Inside SQL" books.
I don't see much adverse parallelism on my server, and what there is I expect to cure with optimal indexing - or a plan guide.
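For reference, capping parallelism server-wide is done through sp_configure rather than per query. A sketch (setting the value to 1 disables parallel plans entirely; pick whatever suits the box, and note MAXDOP query hints still override this):

```sql
-- 'max degree of parallelism' is an advanced option, so expose it first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 1 = no parallel plans; 0 = let SQL use all available CPUs
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;
```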
http://www.sqllion.com/2009/05/optimizi ... rver-2005/
See Paul Randal's blog post here: http://www.sqlskills.com/BLOGS/PAUL/cat ... stems.aspx
Also, regarding the tempdb usage in 2005: it is true it went up, but that depends on your workload. This is mostly due to the row versioning that was added. For NAV, you will see more tempdb usage with pre-5.0 clients, since the SIFTs were maintained using triggers instead of indexed views. In SQL 2005, the inserted and deleted virtual tables used in triggers were changed to be implemented using row versioning instead of scanning the transaction log, so you'll see I/O patterns change. With NAV 5.0, these are implemented using indexed views instead, so you won't see this additional usage (though that is not to say you won't experience any, depending on what temporary objects it needs to create as part of this maintenance; it just won't be significantly greater between versions).
Aside from that, if you turn the READ COMMITTED SNAPSHOT isolation level option on for the database, you will also see increased tempdb activity.
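Whether READ COMMITTED SNAPSHOT is already on can be checked in `sys.databases`; turning it on requires exclusive access to the database. A sketch (the database name below is a placeholder):

```sql
-- Check the current setting (1 = enabled)
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'YourNavDatabase';  -- placeholder database name

-- Enable it; this blocks until no other connections are active in the db
ALTER DATABASE YourNavDatabase SET READ_COMMITTED_SNAPSHOT ON;
```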
Example query:
USE tempdb;
GO
SELECT  database_id,
        file_id,
        io_stall_read_ms,
        num_of_reads,
        CAST(io_stall_read_ms / (1.0 + num_of_reads) AS numeric(10,1)) AS [avg_read_stall_ms],
        io_stall_write_ms,
        num_of_writes,
        CAST(io_stall_write_ms / (1.0 + num_of_writes) AS numeric(10,1)) AS [avg_write_stall_ms],
        io_stall_read_ms + io_stall_write_ms AS [io_stalls],
        num_of_reads + num_of_writes AS [total_io],
        CAST((io_stall_read_ms + io_stall_write_ms) / (1.0 + num_of_reads + num_of_writes) AS numeric(10,1)) AS [avg_io_stall_ms]
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL)
ORDER BY file_id DESC;
GO
You can compare the results of this query with results of this same query on the user database.
High usage of tempdb can also be caused by bad indexing - for example, queries with an ORDER BY for which no index can be used. Sorting big result sets can cause high tempdb usage.
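A minimal illustration of that last point (the table and column names here are made up, NAV-style): when the index key matches the ORDER BY, SQL Server can return rows in order instead of spilling a sort into tempdb.

```sql
-- Without a supporting index, this sorts the whole result set,
-- and large sorts spill to tempdb
SELECT [Entry No_], [Posting Date], [Amount]
FROM [CRONUS$G_L Entry]              -- hypothetical NAV-style table
ORDER BY [Posting Date];

-- An index keyed on the ORDER BY column lets the engine avoid the sort
CREATE INDEX IX_GLEntry_PostingDate
ON [CRONUS$G_L Entry] ([Posting Date]);
```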