SQL Server deadlocks not automatically

deblockerdeblocker Member Posts: 19
Running Navision 4, over MSSQL2K 30 users. We migrate our customer from the 3.60 to 4.0. Now, our customer suffer deadlocks 8-10 times in the week.
The SQL Server is not able to kill the blocking process and the deadlocks are not automatically resolved.
The blocking client freezes and the Navision error messages 1204 or 1205 never appears in this situation, however it will in other deadlock situations, which are automatically handled by MSSQL.
The only way to resolve the deadlock is to kill the session from Enterprise Manager, after identify the blocking process.
In the mean time, all other users are blocked and cannot longer work.
We never had this problem in the old 3.60 version. Shall we search in source code for this problem?
Has anyone knowledge where the locking order has been changed between Navision 3.60 and 4.00, or where can i find some documentation about this?
Table 309 No Series Line is always involved in the blocking process.
Optimization plan with index rebuild is done every night and we never had performance problem, other than this one now.
Tracing with the Client Monitor and looking into the Potential Deadlocks gave me no useful information, as long they are valid only for native database.
Is there anywhere some more detailed documentation how to read the Client Monitor information about SQL Index conflict?

Answers

  • Marije_BrummelMarije_Brummel Member, Moderators Design Patterns Posts: 4,262
    Search this forum about deadlocks. There is some info about it.

    You need to solve it via coding.

    There is a document about this in the downloads section.
  • deblockerdeblocker Member Posts: 19
    Hi Mark,
    many thanks for your answer.
    I think we have all the documentation that is available on this forum, but we are investigating since many weeks, without success.
    I think this is a problem of locking escalation, do you have any idea how to trace this with the Client Monitor?
    Any hints will be highly appreciated.
  • Marije_BrummelMarije_Brummel Member, Moderators Design Patterns Posts: 4,262
    No, I am sorry (and happy) to say not to have experience with solcing deadlock's :D

    But colleages and friends have :(

    Have you read Tuning Navision for better performance.pdf ?
  • deblockerdeblocker Member Posts: 19
    Yes, i do that (not just only one time...)
    In fact, many times resolve some deadlocks is the same thing as increasing the performance, but sometimes we have to do the opposite thing...
  • aohlfsenaohlfsen Member Posts: 30
    I have an issue like yours in session with Microsoft support right now.

    Only we experience it with 3.70.

    As i said to Microsoft maybe we can live with deadlocks to some extent and if not we can try to fix as many as possible using the normal ways and tools for doing this.

    What we can not live with is, when the deadlocks don't get handled in an appropriate manner i.e. the client hangs.

    The appropriate manner being, as is normally the case, that a message is sent to the client that a deadlock has occured.

    I consider the hung client scenario to be some kind of core error in SQL-server or Navision.

    I will let you know if I learn anything further from my case.
  • bbrownbbrown Member Posts: 3,268
    It would appear that your are not experiencing a deadlock, but rather a situation where one user is holding a a record(s) locked that other users are trying to access. Since the Navision Client does not set lock-timeout, they will continue to wait until the lock ends, either by the locking process finisshing or being terminated.

    You will need to resolve this issue by adjusting the processes that are causing the issue. Try using the SQL ClIent Monitor that comes with the performance toolkit to help isolate the issue.

    One thing to look at: Are you using any processes that insert either entire large documents are numbered journals as single transactions? We had a similar problem on a site where we were using Commerce Gateway to import large sales orders (~500 lines). Since they were inserted has a single transaction, it would hold the lock the Number Series Line until the entire order was processed. When other users would attempt to create an order through manually, they would experience a "system hang" when hitting F3 for a new order. 9The system was waiting for the Number Series Line Table to be unlocked.

    Our solutionwas to modify Navision to give Commerce Gateway its own number series for orders. This way it did not lock the record neded by the other users.
    There are no bugs - only undocumented features.
  • aohlfsenaohlfsen Member Posts: 30
    In our case we we use the sql server deadlock trace flags and we allways see a deadlock registration in the SQL-server log-file when we have the hung application scenario.
  • kinekine Member Posts: 12,562
    1) If you have no Deadlock message, it is not deadlock...
    2) How do you know which table is involved in the locking? Any other tables invlolved?
    3) What are the users doing in time of the locking (the locked users)?
    Kamil Sacek
    MVP - Dynamics NAV
    My BLOG
    NAVERTICA a.s.
  • DeSpDeSp Member Posts: 105
    Hm, so I am not alone with this issue - here is my story:
    http://www.mibuso.com/forum/viewtopic.php?t=7853&highlight=havinf+deadlocks

    Iam very interested in what Microsoft can say about this issue, so, aohlfsen, would you please post your Microsoft support session progress in this topic. As I know, they can't help anyhow - so you'll have to solve it by yourself.

    deblocker, as you've read the Tuning Navision for better performance.pdf you must be know how you should build your custom queries e.g. (you must SETCURRENTKEY before any database call and SETFILTER/SETRANGE, you must not left holes in the current key and so on) - it is very important for SQL Server. As for me, I have three docs: SQL Server Recource Kit.pdf, Tuning Navision for better perfromance.pdf and Performance troubleshooting guide.pdf - all of them contain very useful info about optimisations you should take to significantly increase your performance. I follow these recommendations and I notice that I don't have any hangs anymore - so I hope this is the cure. But I still want to know what is the cause leading Navision SQL Server Option to a total dead hang - is it a core error with using non optimal keys or something else?.

    I hope it'll help you anyhow.
    Best regards, DeSp
    Nil desperandum
  • kinekine Member Posts: 12,562
    It is mainly wrong SQL indexes (created for keys with MaintainSQLIndex = yes) with low selectivity (for example key beginning with Document type etc.) - when you insert or modify one record, you will "lock" whole page in the index for the same "Document type" and nobody is able to insert or modify another record on same page... same problem is for SIFT tables, where are the records locked for the bucket... and if you optimize this, the possibility of the locking is much lower...
    Kamil Sacek
    MVP - Dynamics NAV
    My BLOG
    NAVERTICA a.s.
  • DeSpDeSp Member Posts: 105
    Yes, locking is acceptable but hanging is not. Why does hanging occur? I saw clients hanging for hours! I don't think that bad key selectivity made this.
    Nil desperandum
  • kinekine Member Posts: 12,562
    It depend on the locking chain. If one client freeze for example because unstable OS, all others, which want to work with some locked resources, freezes too, because if you are waiting for lock release in Navision, Navision client freeze... client is waiting (it does not hang) it only does not refresh screen and does not process windows message queue... and if the head of the chain is freezed for example for 2 hours, others will be "freezed" for 2 ours too... In SP1 you can set lock Timeout. it means, if the lock is not released in this time, the Navision will timeout and show you some message about this...
    Kamil Sacek
    MVP - Dynamics NAV
    My BLOG
    NAVERTICA a.s.
  • deblockerdeblocker Member Posts: 19
    Hi all,
    a strange thing is that, in my case, this problem never occurs in the 3.60 version, only after the migration to the 4.00 version, so i'm looking now where the locking order is changed in Navision standard code between ver. 3.60 and 4.00.
    Someone already investigated about that?
  • DeSpDeSp Member Posts: 105
    Kine, I understand this, but I don't get how some query can freeze a client for hours? This client locks resources and he has not any open transactions (open tran=0, status=sleeping) - what is he processing then? Yes as he locked resources, other clients waited for them - it is clear. It can be an OS error or SQL Server error leading to a orphaned session - but I don't see any errors in their logs.
    Nil desperandum
  • kinekine Member Posts: 12,562
    DeSp wrote:
    This client locks resources and he has not any open transactions (open tran=0, status=sleeping) - what is he processing then?

    If it is the head of the chain, may be that it is the bug which was existed in .370A but was solved in 3.70B (and the 4.00 can include this bug too, because it is solved in Update 1 or SP1). This bug was that in some scenario, the Message keep the locks on records (if Message was called from codeunit called through IF codeunit.RUN then statement). But I have not tested it on NA 4.00. Or it is problem that is waiting for some lock release too... (it can be why the no. of transactions is 0 - no transaction started, only trying to lock).

    Deblocked, Desp: Are you using Update 1 for Navision 4.00? Or you are using SP1?
    Kamil Sacek
    MVP - Dynamics NAV
    My BLOG
    NAVERTICA a.s.
  • aohlfsenaohlfsen Member Posts: 30
    I think i may be able to reproduce something that looks like this problem.

    At least sometimes.

    Last test 2 out of 5 gave successfull reproduction:


    Make two codeunits

    Codeunit 1:

    txt.DELETEALL;
    COMMIT;

    IF CONFIRM('Deletion done') THEN;

    nulls:='0000000000';

    FOR i:=1 TO 200000 DO BEGIN
    txt.Code:='A'+COPYSTR(nulls,1,6-STRLEN(FORMAT(i)))+FORMAT(i);
    txt.INSERT;
    END;

    IF txt.FIND('-') THEN
    REPEAT

    UNTIL txt.NEXT=0;

    Codeunit 2:

    nulls:='0000000000';

    FOR i:=1 TO 200000 DO BEGIN
    txt.Code:='B'+COPYSTR(nulls,1,6-STRLEN(FORMAT(i)))+FORMAT(i);
    txt.INSERT;
    END;

    IF txt.FIND('+') THEN
    REPEAT

    UNTIL txt.NEXT(-1)=0;

    For both codeunits:

    Variable i is an integer
    Variable txt i "standard text" table
    Variable nulls is of type Text 10


    Procedure:

    In client 1 run codeunit 1

    When you see the message "deletion complete" press ok and immediately hereafter run codeunit 2 with client 2.

    After a while one of the clients will complete its execution, and a deadlock message may or may not have been presented to the user.

    The other client will often appear to be hung forever.

    If you use the deadlock trace flags on the SQL server you will notice from the SQL server log that a deadlock has been registered as having occured between the two clients on the "standard text" table.

    At least it does with me with 3.70A, 3.70B build 19868 and Navision 4 sp1 on SQL server 2005.

    Don't tell me that the code makes no sence. I know this. It was only made for reproduction purposes.

    Best regards
  • deblockerdeblocker Member Posts: 19
    Hi aohlfsen,
    with:
    At least it does with me with 3.70A, 3.70B build 19868 and Navision 4 sp1 on SQL server 2005.
    you mean that you tested also other versions, like 3.60 and 4.00?

    ...let me know

    best regards
  • aohlfsenaohlfsen Member Posts: 30
    Sorry haven't tried 4.0 and 3.6.

    It takes time to get all these test-environments up and running, but i suspect that the problem would be the same.

    Best regards
  • deblockerdeblocker Member Posts: 19
    Thu Mar 30, 2006 8:43 am DeSp wrote:
    >>We are experiencing the same problems with Navision 3.7A and SQL >>Server 2000 sp3.
    >>Could you please tell me if you found any solution.

    Hi DeSp,
    I'm sorry, I don't found any solution. Read Kine's postings and you will find the answer. Kine has a clear vision of the problem described here:
    its not a deadlock, its a "never solved lock". Resources (or more table) are blocked and never released from the Navision client. For example, in our blocking chain, five, sex or more clients are blocking each other, until we kill the blocking spid. Like Kine, I think that with such a problem the best solution is to migrate asap to Navision 4.0 SP1 where is possible to set the locking timeout. What you mean about?

    Have a nice day.
  • aohlfsenaohlfsen Member Posts: 30
    The stuff that I can reproduce with the previous example seems to be an ODBC driver issue.

    Microsoft is currently looking into this.

    Will let you know when i hear something interesting.

    SP1 will not resolve this issue, the frequency may however be different.

    B.R.
  • DeSpDeSp Member Posts: 105
    Thanks for your replies, deblocker and aohlfsen.

    All we could do here is only decreasing the frequency of these hangups - we optimized SIFT and key performance everywhere. But they still appear from time to time. The locking timeout feature is good, but in this project we can't upgrade our current Navision 3.7A installation to 4.0 SP1.
    We have SQL Server SP3 installed, maybe it would be better to install SP4? How do you think?

    aohlfsen, could you please post your Microsoft support session's progress for this issue here?

    Let's post our findings in this topic.

    Kindest regards.
    Nil desperandum
  • aohlfsenaohlfsen Member Posts: 30
    Microsoft seem to think that at least the problem i have with my test-code can be solved with the hotfix described in kb890470, and indeed this seems to be the case.

    If you use Windows 2003 SP1 for clients this hotfix will allready be included as part of MDAC SP2.

    We are on Windows 2003 without SP for other reasons and have applied the hotfix manually.

    Onfortunately the hotfix we have got doesn't seem to apply for XP SP2. Perhaps Microsoft has an XP edition ;-)

    So its actually an MDAC problem that needs to be solved it would seem.
  • DeSpDeSp Member Posts: 105
    That's really a good news.

    We are on WIndows 2000 and Windows 98 8-[ with the last service packs installed.
    Who should we contact to in order to obtain the hotfix for these OS?
    Is it necessary to apply MDAC 2.8 SP1 before applying this hotfix?
    Nil desperandum
  • Miklos_HollenderMiklos_Hollender Member Posts: 1,598
    OK, there is something I don't understand. If you are using XPSP2, MDAC 2.8 SP1 you already have. You can only use MDAC 2.8 SP2 with Windows Server 2003 SP1 but obviously that's not something you'd want to buy for each client. So what's the general way to go?
  • ovicashovicash Member Posts: 141
    OK, there is something I don't understand. If you are using XPSP2, MDAC 2.8 SP1 you already have. You can only use MDAC 2.8 SP2 with Windows Server 2003 SP1 but obviously that's not something you'd want to buy for each client. So what's the general way to go?

    from where din you get Mdac 2.8 sp2 ?

    there is no item in http://msdn.microsoft.com/data/ref/mdac/downloads/

    thx
    ovidiu

    Best Regards
Sign In or Register to comment.