Buckets Effectiveness

Belias · 2009-09-14

Hi everyone, i am taking a look at SIFT buckets optimization, and i would like to unmanage some of them in order to reduce db size and increase write performance (oh, well...i think anyone knows why i want to do it

)
I recently have read an official MS manual about optimization, and it is written that you can proceed with buckets tuning like this: disable each bucket except the one "right before the last" ( :oops: sorry 4 this english).
then try to activate one more index above if you see that performance is not good, and so on...

My question is: how do i know if a bucket is performing or i can disable it and let sql do the sum?
is there some tool to do it or should i monitor the whole system (multiuser, too!) for a while and retry?

My first thought was to apply just a bit of filters for that key (as less selective as possible, in order to emulate the worst case) and then do a calcsums.
if performances are odd, i'll activate another bucket and retry...and so on

David_Singleton · 2009-09-14

Yes there is a very good tool available for this, the SQL Perform Analysis tool does exactly what you are asking.

check their site http://www.sqlperform.com/html/perform_details.aspx?id=22

kine · 2009-09-15

You can select the levels based on your knowledge of the system. If you know which filters are users (and the code) using, you can select only the appropriate levels. And if you know selectivity of different fields, you know which levels you do not know because the lower level will have only few additional records and SQL could sum them instead maintaining the higher level. Higher level means bigger part of table locked when updating the records (Grand total means table lock each time).

Belias · 2009-09-15

Yes, i know the trade off of activated buckets, but unfortunately, they have not been analyzed for this database, and i've been working for this customer for a year or so while this customer is on nav since 2006 (i've only given a hand to other developers, so i don't know a lot of logic of the customized system

)...i hoped there was a "scientific" method to analyzed them, but knowledge of the system is required...obviously this customer has lock problems, this is why i am analyzing write performances, too.

kine · 2009-09-15

You know the way - cut all to minimum and add where you see that it is needed... ;-) it means that in meantime it will be slow, but after that it will be better...

ajhvdb · 2009-09-15

Can somebody provide me with an example or link how this speeds up the SQL db?

Belias · 2009-09-15

if you can access this
https://mbs.microsoft.com/partnersource/communities/training/trainingmaterials/student/course80156.htm
It's really interesting (chapter 5 includes buckets, too)

stryk · 2009-09-15

Another tool providing also a feature for SIFT (and VSIFT) analysis is the "NAV/SQL Performance Toolbox" (http://www.stryk.info/english/toolbox.html).

But for a start I recommend this proceeding:

1. Create a shortlist about all important tables: tables containing many records and maintaining lot of SIFT indexes.

With these tables:

2. Disable all higher SIFT Buckets. Actually you need just the one next to the PK level (Caution: a PK level aggregation is NO aggregation!).
For example, if a SIFT index creates 5 Buckets (0 to 4) then 4 is the PK level (and disbaled by default - don't change this), and 0 is the TOTAL level (which is disabled, too). Thus, Bucket 1, 2 and 3 might be enabled. You basically need just Bucket 3, so you could disable 1 and 2. With this optimization you'll get rid of Millions of records, hence, the SIFT tables will remarkably shrink and perfrom much better! What you achieve with this is actually the same aggregation level as provided with the new VSIFT Views - those are summing on the same level.
Just this optimization shoud give a remarkable performance boost - fater reading & writing, less blocks.

After implementing these changes:
3. Run SIFT Maintenance, deleting the "Zero-Sum-Records", e.g. http://www.mibuso.com/dlinfo.asp?FileID=812 Caution: some NAV versions/builds are affected by some bug where this maintenance could corrupt FlowFields!
4. Run some Index-Maintenance (Defragmentation)
5. Run a Statistic-Update

After all this your SIFT structures - the important ones - are 98% OK.
To further tune the system you could disable all SIFT indexes (MaintainSIFTIndex = FALSE) of tables which contain very few records or are so called "hot tables".

Check for problems:
6. Use the "SQL Server Profiler" to search for bad performing queries (e.g. Reads >= 1000), here especially SELECT SUM queries on SIFT tables. This could indicate some problem with SIFT aggregation. But: solving this does not necessarily mean you have to enable another Bucket for this affected SIFT index! Just if everything fails, you should enable the next one higher bucket.
7. Instead of adding Buckets you could usually boost SIFT (and also VSIFT) queries by adding specific SQL site indexes (covering or using "Included Columns"). Even though these will be pure SQL things, the performance gain is much better (especially using INCLUDE is to highlight) than maintaining/executing the SQL Trigger (used to update the Buckets).

All this above is relatively quick & easy, but giving you a tremendous success! A thorough "effectiveness analysis" is (much) more time-consuming - and usually telling you what you already knew from above.

Belias · 2009-09-15

EXAMPLE:
-TABLE: G/L Entry
-KEY: G/L Account No.,Business Unit Code,Global Dimension 1 Code,Global Dimension 2 Code,Close Income Statement Dim. ID,Posting Date
-SUMINDEX: Amount,Debit Amount,Credit Amount,Additional-Currency Amount,Add.-Currency Debit Amount,Add.-Currency Credit Amount
-MAINTAINED LEVEL: only the eight (i'm in nav4sp1)

I RUN THE CODE

gled.SETCURRENTKEY("G/L Account No.","Business Unit Code","Global Dimension 1 Code","Global Dimension 2 Code",
"Close Income Statement Dim. ID","Posting Date");
gled.SETRANGE("G/L Account No.",'017060001');
gled.CALCSUMS(Amount,"Debit Amount","Credit Amount","Additional-Currency Amount",
"Add.-Currency Debit Amount","Add.-Currency Credit Amount");

From sql profiler i have about 1400 reads with 219 cpu time (my laptop)

-if i enable the 7th level of the sift (up to posting date:Month), too, i obtain a 114 reads with only 19 CPU time

In my opinon, this is good enough, and i would not enable more SIFT buckets...am i correct? (i didn't already try covering IDX)

David_Singleton · 2009-09-15

What is the setcurrent key for? It concerns me.

David_Singleton · 2009-09-15

Belias wrote:
EXAMPLE:
-TABLE: G/L Entry
-KEY: G/L Account No.,Business Unit Code,Global Dimension 1 Code,Global Dimension 2 Code,Close Income Statement Dim. ID,Posting Date
-SUMINDEX: Amount,Debit Amount,Credit Amount,Additional-Currency Amount,Add.-Currency Debit Amount,Add.-Currency Credit Amount
-MAINTAINED LEVEL: only the eight (i'm in nav4sp1)

I RUN THE CODE
gled.SETCURRENTKEY("G/L Account No.","Business Unit Code","Global Dimension 1 Code","Global Dimension 2 Code",
"Close Income Statement Dim. ID","Posting Date");
gled.SETRANGE("G/L Account No.",'017060001');
gled.CALCSUMS(Amount,"Debit Amount","Credit Amount","Additional-Currency Amount",
"Add.-Currency Debit Amount","Add.-Currency Credit Amount");
From sql profiler i have about 1400 reads with 219 cpu time (my laptop)

-if i enable the 7th level of the sift (up to posting date:Month), too, i obtain a 114 reads with only 19 CPU time

In my opinon, this is good enough, and i would not enable more SIFT buckets...am i correct? (i didn't already try covering IDX)

Just looking at the code, I can say that it wont tell you anything. In fact the data from this could seriously damage the work you are doing if you trust it. Its good for playing and learnign, but please don't use this for tuning a clients live data.

Belias · 2009-09-15

No, for sure not...as i said...

From sql profiler i have about 1400 reads with 219 cpu time (my laptop)

I "REALLY" do things only when i know what i am doing...this is why i am asking someone more expert if i am doing something wrong...Of course i am trying to learn ON MY LAPTOP TEST database

Why doesn't the code tells me anything, in your opinion?isn't this the hardest query i can do for this SIFT?or am i missing something?

about this

David Singleton wrote:

What is the setcurrent key for? It concerns me.

well, i do the setcurrent because i don't want to receive

Microsoft Business Solutions-Navision
The sum of the values in the Amount field cannot be calculated because the current key does not contain all the fields being filtered.

You must select a key that contains all the fields in the filter. The order of the fields is unimportant.

Filters: G/L Account No.: 017060001
Table: G/L Entry
Key fields: Entry No.
OK

stryk · 2009-09-15

Well, in my opionion this is the correct way to tune the SIFT structure. Usuall level 8 alone should do. As you have got the evidence that some queries need some higher aggregation you should enable the next level, in this case 7.
So IMHO it is important to step-up this "aggregation ladder" step by step, from bottom to top.

Alternatively - instead of bucket 7 - you could try to add a SQL site index in SSMS (here a covering index):

USE [My_NAV_DB]
GO
CREATE NONCLUSTERED INDEX ssi_CovIdx ON dbo.[CRONUS 403$17$1]
([bucket], [f3], [f45], [f23], [f24], [f71], [f4], [s17], [s53], [s54], [s68], [s69], [s70])
GO

What's the query performance then?

stryk · 2009-09-15

David, could you please specify your concerns?

Belias · 2009-09-15

stryk wrote:
Well, in my opionion this is the correct way to tune the SIFT structure. Usuall level 8 alone should do. As you have got the evidence that some queries need some higher aggregation you should enable the next level, in this case 7.
So IMHO it is important to step-up this "aggregation ladder" step by step, from bottom to top.

Alternatively - instead of bucket 7 - you could try to add a SQL site index in SSMS (here a covering index):
USE [My_NAV_DB]
GO
CREATE NONCLUSTERED INDEX ssi_CovIdx ON dbo.[CRONUS 403$17$1]
([bucket], [f3], [f45], [f23], [f24], [f71], [f4], [s17], [s53], [s54], [s68], [s69], [s70])
GO
What's the query performance then?

Yes, I have just tried it, following your previous advice

i don't notice improvements, anyway...1400 reads, same cpu & duration (i also have done an "update statistics" on $17$0, if it is useful)

P.S.: probably a concern of david is: is the filtered value significant?
my answer would be yes, because filtered set is 500000 records on a total of 5464146 (about 10%)

David_Singleton · 2009-09-15

Belias wrote:

No, for sure not...as i said...

From sql profiler i have about 1400 reads with 219 cpu time (my laptop)

I "REALLY" do things only when i know what i am doing...this is why i am asking someone more expert if i am doing something wrong...Of course i am trying to learn ON MY LAPTOP TEST database

Why doesn't the code tells me anything, in your opinion?isn't this the hardest query i can do for this SIFT?or am i missing something?

about this

David Singleton wrote:

What is the setcurrent key for? It concerns me.

well, i do the setcurrent because i don't want to receive

Microsoft Business Solutions-Navision
The sum of the values in the Amount field cannot be calculated because the current key does not contain all the fields being filtered.

You must select a key that contains all the fields in the filter. The order of the fields is unimportant.

Filters: G/L Account No.: 017060001
Table: G/L Entry
Key fields: Entry No.
OK

i its just for learning then as I said no problems. But in the real word be very very carefull of specifically tuning one account like this. You really need to monitor lots of transactions and agregate the results. You don;t want to bias your interpretation of performance on one stat. I have seen that danger often.

As to the setkey, I guess I haven't used calcsums for a long time, becasue I thought in SQL it ignored the selectkey. Good to know. BTW does SQL actually use this key or does it pick the one it wants?

Belias · 2009-09-15

I wouldn't evaluate performance of a sift on one query only, because i'm not expert; instead i WOULD like to do it, but i would like an expert to confirm me if my method can be used for this purpose.
In other words, my question can be: "does this single query is significant, supposing that is the worst query that can be done?"

my thought is that G/L account no. is the first field of the key, and i would be in danger ONLY if a user would aggregate more G/L accounts, isn't it?
Is there a particular query that can take down my theory?

About the fact that sql uses the specified key or not, i don't know because i'm doing my first steps in sql profiler and i can't find "execution plan" column.

Theorically, sql will query the faster SIFT table for the same calculated field. BUT I'M NOT SURE WHAT IS THE SQL LOGIC TO CHOOSE THE TABLE, I LEAVE THE CLARIFICATION TO SOMEONE ELSE

EDIT:-no more "solved" attribute for now

stryk · 2009-09-15

Well, actually a SETCURRENTKEY has not a direct impact on the query which is fired - the related SELECT SUM statemet has no ORDER BY (that's what the KEy-Context is normally used for)
but if there are multiple SIFT Indexes providing the same/similar sum-data, maybe the Key-Context is used by C/SIDE to determine which SIFT table to query ... I'm not sure about this ...

The SIFT optimization I explained above is something which could (should!?) be implemented anyway, as this definitely will boost performance. In very few cases having just one bucket is not enough, then normally a second one is all you need.
(well, actually this is my experience from dozens of optimization projects so far)

If specific tuning is required should IMHO not depend on a single query. That's why I'm analyzing SQL Profiler Traces with my "NAV/SQL Performance Toolbox" which counts how often a specific problematic query was recorded.
Then I could decide if it was just a singular problem which I could ignore (maybe not), or if it is a recurring issue which needs a fix.

To see if/which SIFT table is queried, you don't neccessarily need the QEP - as recording the "Performance: Showplan XML" event would increase the TRC file remarkably.
Simply record the "SP:Stmt Completed" and "SP: RPC Completed" events to check out the statements.
In your specific case this query should be a SELECT SUM(...) on table ...$17$1
These numeric tables are the SIFT tables. If you see a SELECT SUM(...) on ...$G_L Entry the SIFT index isn't used - for whatever reason - and the sums are calculated directly from the G/L Entry table ...

DenSter · 2009-09-15

To put it in a few less words:
If you see SUM queries on table names that you recognize from the NAV table designer, it is NOT using SIFT.

This doesn't always mean it's bad though, it might perform wonderfully. You have to look at the actual queries, their durations, their number of reads, and make some kind of conclusion about it. Even with customers that use the system in the same way you're going to find different performance problems. Even if a problem is solved today, it might need a different approach next week.

Performance tuning is an ongoing task. You can eliminate some large problems once and for all, but you're always going to find issues to solve.

Belias · 2009-09-15

I was just analyzing the query as you suggested, and i am now understanding how does it work...
for simple
select sum() from $17$0 it's ok, but i run into this odd query of 19000 reads:

SELECT SUM("sum1"),SUM("sum2"),SUM("sum3"),SUM("sum4"),SUM("sum5"),SUM("sum6") FROM (SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  WHERE (bucket=@P1 AND ((f4>=@P2 AND f4<@P3))) UNION ALL SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  WHERE (bucket=@P4 AND (f4<@P5)) UNION ALL SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  WHERE (bucket=@P6 AND ((f4>=@P7 AND f4<=@P8)))) AS [Sums]

What the hell is this?i'm not so skilled in sql, and i have completely lost in round brackets and parameters :oops:
anyway, as it took 267 cpu and 212 duration, i tried to see what happens if i enable another bucket (posting date:year)
The query changed to this (cpu,duration and reads didn't change at all)...Now, i am not asking you to solve my problem, but it think this query on posting date, will ever be odd, because i added all the needed fields to the buckets.

SELECT SUM("sum1"),SUM("sum2"),SUM("sum3"),SUM("sum4"),SUM("sum5"),SUM("sum6") FROM (SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  WHERE (bucket=@P1 AND (f4<@P2)) UNION ALL SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  WHERE (bucket=@P3 AND ((f4>=@P4 AND f4<@P5))) UNION ALL SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  WHERE (bucket=@P6 AND ((f4>=@P7 AND f4<=@P8)))) AS [Sums]

P.S.: code in nav is this: as you can see, i applied only a filter on posting date...it shouldn't be easy for this to happen in a normal database, but i like to bang my head everywhere

glen.SETCURRENTKEY("G/L Account No.","Business Unit Code","Global Dimension 1 Code","Global Dimension 2 Code",
"Close Income Statement Dim. ID","Posting Date");
glen.SETRANGE("Posting Date",0D,090807D);
glen.CALCSUMS(Amount,"Debit Amount","Credit Amount","Additional-Currency Amount",
"Add.-Currency Debit Amount","Add.-Currency Credit Amount");

EDIT: stryk, you talked about flowfield corruption in some nav versions, can you kindly explain it more precisely please? I have nav 4sp1 build 21666
EDIT2: found it!

https://mbs.microsoft.com/knowledgebase/KBDisplay.aspx?WTNTZSMNWUKNTMMYLSVQUSPTNTNSMQPYSLNSWLOLVQLPTWPYXWSOTOLUPOXXUNPL
But how can i know if my version is affected?the build no. refer to SP3...and the "applies to" refers to microsoft dynamics NAV 4.0 in general :-k

stryk · 2009-09-15

Thus, this is actually the query (just added some line breaks):

SELECT SUM("sum1"),SUM("sum2"),SUM("sum3"),SUM("sum4"),SUM("sum5"),SUM("sum6") 
FROM (SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" 
FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  
WHERE (bucket=@P1 AND (f4<@P2)) 
UNION ALL 
SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" 
FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  
WHERE (bucket=@P3 AND ((f4>=@P4 AND f4<@P5))) 
UNION ALL 
SELECT  SUM("s17") AS "sum1", SUM("s53") AS "sum2", SUM("s54") AS "sum3", SUM("s68") AS "sum4", SUM("s69") AS "sum5", SUM("s70") AS "sum6" 
FROM "CRONUS"."dbo"."SIAD CZ$17$0" WITH (READUNCOMMITTED)  
WHERE (bucket=@P6 AND ((f4>=@P7 AND f4<=@P8)))) AS [Sums]

If you also record with SQL Profiler the "SP: RPC Completed" event, you would also see the values for all the @P1 etc. parameters, but currently this is not important.
What NAV does here is, it combines/sums 3 resultsets from the same SIFT table. I guess "f4" is the "Posting Date", so probably 3 different date-ranges are summed here.
Anyway ...
... fiddling with additional buckets might help, or - probably - not. $17$0 is related to the first SumIndex-Key in G/L Entry.

As mentioned before, with specific SQL indexes you might increase the perfromance; in this case I guess this one would help:

USE [CRONUS]
GO

CREATE INDEX ssi01 ON "SIAD CZ$17$0"
(bucket, f4)
INCLUDE
("s17", "s53", "s54", "s68", "s69", "s70")
GO

How does the query perform with this index?

Regarding the "FlowField Corruption": Depending on the NAV version, there are different bugs in the SQL Trigger code which are updating the SIFT. I skip the technical details here, but the result is, IF you run SIFT maintenance - deleting the "zero-sum-records" e.g. using a stored procedure or standard "Table Optimizer" - THEN it could happen that several remaining SIFT records are not properly updated. Hence, FlowFields which are displaying the SELECT SUM results from those affected SIFT records will show wrong values.
This happens not in all cases, but the risk is pretty high.

As far as I know, the affected Database Version No. are these:
NAV 4.0: all versions before 63 (63 is OK, requires SP3 Update 6 Build 25143 or higher)
NAV 5.0: all versions before 82 (82 and up are OK, requires Update 1.? Build ? or higher)
Older NAV versions are not affected. As with 5.0SP1 and higher SIFT was replaced by VSIFT, the problem cannot occur.
See also http://dynamicsuser.net/blogs/waldo/archive/2009/07/13/platform-updates-overview-3-70-b-nav2009-sp1-ctp2-updated.aspx
So I'm afraid your version is one of the buggy ones ... :bug:

Belias · 2009-09-16

what a shame...i've read that the workaround is to uncheck and then re-check the maintainSIFTindex...with save and recompile between them i suppose.
about the covering idx, i think that the sql side tuning for the table will be voided when saved, isn't it?anyway, tables are not modified so frequently, it's worth a try. I'll let you know the results, thanks for now.

Instead, about the choosing of the bucket, i realized that sql has a strange behaviour when choosing the SIFT to query (it's not strange, only unexpected for me).
I think SQL choose the best bucket between all the buckets of all the indexes of that table.
if i:
-setcurrentkey for the $17$0
-set a filter on g/l account no.
-do a calcsums

Nav will search if another bucket has an higher bucket level with "g/l account no." than the one in the $17$0 for the same sumindexfield. If so, it will choose it... :-k interesting...

Here's the result after the covidx: first of all, the real installation is on sql2000: if in the future i want to put these modifications on a "real" (test or production) database, i won't be able to use included columns, sorry for not telling you before.
Anyway, i've created a covidx (all fields of table) for every SIFT table of table 17...no results at all :-k

stryk · 2009-09-18

Yep, enabling/disabling the "MaintainSIFTIndex" will fix the "corruption" as the whole SIFT stuff is re-created.

I'm also not quite sure about how NAV decides which SIFT table to query ... :-k

Regarding this index: ups, sorry - I thought you were running SQL 2005 ... no INCLUDES with 2000 ... sorry :oops:

Have a nice weekend!
Jörg

Belias · 2009-09-21

thanks to everyone for the support:
as i have to upload modification on a slow (and suggested to frequent locks) production db, we are going to talk about this with the customer, and as i don't know how to test if disabling some buckets will not decrease the reading performance too much (unfortunately, i've not got a multiuser environment, moreover, my hardware -a laptop- is not significant for tests), i think i am going to ask them to try modifications for a couple of days, if the result is bad, we'll roll back the situation...
I know it's not a really fair way to work (i'm risking), but it' the only way in my opinion...
i'll let you know in a pair of weeks when we'll do it [-o<

kine · 2009-09-21

This is best to do on-site, disable, monitor, enable what you need, it is question of one or two days...

Belias · 2009-10-19

I've done the modification on the production DB (i just remembered to update this post).
What i've done:
- clustered idx bug corrected
- some indexes for big tables have been streamlined
- SIFT Tables recreated (i disabled the maintenance of all sifts, and then i've re-enabled it with less buckets active)
- DBCC DBCheck
- Maintain Statistics

Result:
- they have been working for a week with less blocks and increased performance
- DB size Decreased from 48 GB to 27 GB

Problems:
- they have enhanched security model (

), I didn't thought that recreating SIFT means that i have to resynch all permission, because it's like creating a new table...it has been a bad monday ](*,) ](*,)
- SQL have had to be warmed up berfore having users working well

Howdy, Stranger!

Quick Links

Categories

Buckets Effectiveness

Answers