NST refuses new connections, active ones work

PhennoPhenno Member Posts: 630
edited 2015-03-18 in NAV Three Tier
Hello all.

We have a problem for which I cannot find the cause...

From time to time (quite often), NAV 2013 R2 clients cannot log in to NAV, while active connections on that tier keep working. This continues until the NST is restarted. While clients cannot create new connections, there are no error events in Event Viewer, only a large number of information events in the Security log: event ID 5152, WFP blocked a packet on port 7046.

I searched for this event ID and found that it belongs to the Windows Filtering Platform: Windows blocks a packet to a port, even if that port is open in the firewall (or the firewall is down), when no service is listening on that port, which prevents port scanning (this is called Windows stealth mode and works separately from Windows Firewall). Since existing NAV connections work properly, I would say that port 7046 does have a listening service, so I'm stuck: why would Windows block this packet if the port is open and the service is running?
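One way to check the "is anything listening on 7046" question from a client machine is a plain TCP connect test. The following is a minimal sketch using only the Python standard library; the function name `can_connect` and the host name in the usage note are hypothetical, and 7046 is simply the port mentioned above. Note that a successful TCP handshake only shows that the operating system accepted the connection on that port; it does not prove that the NAV service will go on to authenticate and create a session.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("nav-server", 7046)` (with `nav-server` standing in for the real server name) could be run while clients are failing to log in, to separate "port unreachable" from "port reachable but NAV not creating sessions".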

I have checked the Max Concurrent Calls and Max Concurrent Connections settings on the NST, and they are set well above the number of users.

Windows 2012 R2
NAV 2013 R2 with CU 10
Three instances in use, one client and two NAS.
No separate AV installed.

This happens from time to time, but usually when a lot of users are already connected (50+).

Does anyone have an idea what could cause this?

Comments

  • PhennoPhenno Member Posts: 630
    After some thorough analysis, I can rule out network-related, Windows-related, and third-party software issues. It seems that NAV really is refusing new connections for some reason.

    I've double-checked max concurrent calls (150) and max connections (150), but what about other settings such as:
    - Operation timeout (MaxValue),
    - Idle Client Timeout (1h),
    - Reconnect period (10min),
    - SQL command timeout (30min),
    - Max orphaned connections (20)
  • vremeni4vremeni4 Member Posts: 323
    Hi,

    When users cannot connect to the NAV Server service, what error message do they get?
    If the error message is permission related, then something is not right with AD.
    If the error message is network related, then something is wrong at the network level.

    Thanks,
  • PhennoPhenno Member Posts: 630
    vremeni4 wrote:
    Hi,

    When users cannot connect to the NAV Server service, what error message do they get?
    If the error message is permission related, then something is not right with AD.
    If the error message is network related, then something is wrong at the network level.

    Thanks,
    They get a message that the NAV connection failed (after 20 seconds or so).
  • vremeni4vremeni4 Member Posts: 323
    Hi,

    This is definitely a network issue. It is not NAV related but Windows Server related.

    You can try to change the WFP settings in the Active Directory policies.
    The Administrator of the system should be able to help you with this issue.

    This KB may also be helpful
    http://support.microsoft.com/kb/2654852

    Installing the latest SP may be an option too.

    I hope this helps.
    Thanks.
  • PhennoPhenno Member Posts: 630
    Vremeni4, this is Windows 2012R2 so this KB does not apply to it.

    As for the network: when the server is refusing new connections, I cannot connect from a local client either, and in that case I do not see a logged WFP event (or maybe I missed it).

    Does a local connection to the server go through WFP?
  • Rikt-ItRikt-It Member Posts: 37
    hi,

    Check the account used by the service tier. Its password has probably expired.

    //Christer
    Regards
    Christer in Stockholm, Sweden
  • PhennoPhenno Member Posts: 630
    Rikt-It wrote:
    hi,

    Check the account used by the service tier. Its password has probably expired.

    //Christer
    Rikt-It,

    This occurs from time to time and is resolved by a simple restart of the service tier, so I do not think it is caused by wrong service credentials.
  • vremeni4vremeni4 Member Posts: 323
    Hi,

    I think WFP checks all connections, regardless of whether they use localhost or a full IP address.

    Did you try to disable WFP with a group policy in Active Directory?

    Thanks.
  • PhennoPhenno Member Posts: 630
    vremeni4 wrote:
    Hi,

    I think WFP checks all connections, regardless of whether they use localhost or a full IP address.

    Did you try to disable WFP with a group policy in Active Directory?

    Thanks.
    Vremeni, I do not think WFP is the cause by itself, rather a consequence.

    I'm testing a changed setup with the max calls and max connections parameters set to MaxValue. I'm still getting WFP notifications occasionally, which are caused by a poor network IMHO, but that's OK. I'm now waiting to see whether it blocks completely again...
  • PhennoPhenno Member Posts: 630
    The changed setup helped for a while, but the problem occurred again. The NAV server stopped connecting new clients.

    I'm open to any new diagnostic idea...

    There is a difference, though, I did not see 5152 events this time.

    Checked the firewall, it is OK.
    Checked the connection from a local client, netstat says established (hence not a network problem).
    Started a second, backup instance and connected to it normally, meaning NAV and SQL are communicating.
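The netstat check above can be made a bit more systematic by counting connection states on the NST port while the instance is stuck. This is only a sketch: it assumes the English column layout of Windows `netstat -an` (Proto, Local Address, Foreign Address, State), the function name `count_states` is hypothetical, and the addresses in `sample` are made up for illustration.

```python
from collections import Counter


def count_states(netstat_output: str, port: int) -> Counter:
    """Count TCP states for rows whose local address ends in :port."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row shape: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        # rsplit also handles IPv6 local addresses such as [::]:7046
        if fields[1].rsplit(":", 1)[-1] == str(port):
            states[fields[3]] += 1
    return states


# Made-up sample in the shape of `netstat -an` output (not real data).
sample = """
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:7046           0.0.0.0:0              LISTENING
  TCP    10.0.0.5:7046          10.0.0.21:50112        ESTABLISHED
  TCP    10.0.0.5:7046          10.0.0.22:50344        CLOSE_WAIT
  TCP    10.0.0.5:1433          10.0.0.5:49900         ESTABLISHED
"""

print(count_states(sample, 7046))
```

Comparing these counts between a healthy period and a stuck period might show whether connections are piling up in a state other than ESTABLISHED, which would be one more data point rather than a diagnosis.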

    Allowed connections are set to MaxValue, allowed calls to MaxValue, timeout to one hour, and orphaned connections to MaxValue.

    Anyone had similar experience?
  • xStepaxStepa Member Posts: 106
    Hi,

    well, just a point to think about: I saw something similar with NAV 2009 Classic. After some time the SQL server started to refuse new connections, while active users were able to keep working.
    As I remember, there was a bug in NAS. There was an old test NAS service with wrong credentials, which was only stopped (not disabled). After some time, they restarted the server and this service attempted to log in. Because of this bug, each unsuccessful attempt consumed 1 CAL, and after some time it consumed everything. Only a restart of SQL helped...
    Regards
    xStepa
  • PhennoPhenno Member Posts: 630
    Interesting to analyze, since I do have two separate NAV instances for the job scheduler (one for LS Retail and one for standard NAV).

    There are differences with NAV 2013, though: NAS sessions are of type background and AFAIK are not counted against the licence. Furthermore, wouldn't I get a licence error in that case? I'm only receiving "NAV could not establish connection to server".

    But I should probably inspect the Active Session table (which is, to my knowledge, used for the session count) at the moment the service denies connections. Maybe there is some licence issue involved...
  • PhennoPhenno Member Posts: 630
    Well... Same thing happened again. I checked active sessions table, nothing suspicious over there. I've successfully connected to backup instance while main one was refusing new connections and only restart of main instance helped.

    I've checked cumulative updates for R2, nothing much with similar subject.

    I've checked performance counters, nothing suspicious.

    I've checked SQL stats for blocked processes or similar, nothing suspicious over there.

    Quest continues...

    How does NAV check for licenses?

    Could it be something with DD for LS Retail, which is installed and running on the same server?
  • dave_cdave_c Member Posts: 45
    One of our customers has the same issue every 6 months or so. We've logged a call with Microsoft but struggled to get the logs from Microsoft Network Monitor to debug the issue whilst everyone is screaming that they can't log into NAV.

    Did you ever get to the bottom of the issue? I will try and remember to update this thread if we do!
  • PhennoPhenno Member Posts: 630
    Dave,

    I never found a solution for this, but it has been occurring rarely lately. My conclusion was that it was due to network instability between clients and server, causing sessions to hang, or something like that.
  • RemkoDRemkoD Member Posts: 100
    edited 2016-11-14
    Next time you have the issue, do you see any suspicious queries in SQL Server Management Studio > Activity Monitor > Recent Expensive Queries?

    I'm currently investigating an issue related to the automatic cleanup of the Session Event table. Once every 3 months (the default value of the NST setting SessionEventTableRetainPeriod), the NST starts a cleanup function that removes all records older than 3 months from the Session Event table. Somehow, in some customer environments, multiple NSTs start the cleanup task at the same time, causing locking issues on the Session Event table. The result, as you can guess by now, is that new NAV sessions cannot be created.
    I have a call open with Microsoft for this. If you have a similar issue, I will let you know about the progress.

    Edit: Since you're using only 3 NSTs and around 50 users, this might not be the issue.
  • PhennoPhenno Member Posts: 630
    Remko, at least you gave me one more thing to look at while the instance is stuck: checking which tables are locked at that moment. I do not think it's due to the Session Event table, though, since that would lock all instances at the same time. In my case only one instance is locked (usually).

    Though I should definitely check for long-running queries too...
  • jbrajbra Member Posts: 32
    I suddenly have this issue with 1 of the 2 NSTs in our production environment. NAV 2009 R2 on Windows Server 2008 R2.

    When it "goes down", existing connections are fine. Most users cannot make new connections, but some (apparently) can.

    Restarting either the service or the server temporarily resolves the problem.
  • ekaruzekaruz Member Posts: 3
    Hello to all,
    has anyone found a solution for this problem?

    I have the same situation with NAV2013R2 on Windows Server 2012R2.

    Two Services in one Database, and twice a day one specific service needs to be restarted to let users log in again to NAV.
  • KTA8KTA8 Member Posts: 388
    I have the same issue with NAV 2015, and I also have two instances on the same database. I've moved to another NST version, so it can't be a bug. Two weeks or three months can pass between problems.
  • Slawek_GuzekSlawek_Guzek Member Posts: 1,690
    Are those NSTs on the same machine, or on different ones? Have you checked the memory usage?
    Slawek Guzek
    Dynamics NAV, MS SQL Server, Wherescape RED;
    PRINCE2 Practitioner - License GR657010572SG
    GDPR Certified Data Protection Officer - PECB License DPCDPO1025070-2018-03
  • KTA8KTA8 Member Posts: 388
    Are those NSTs on the same machine, or on different ones? Have you checked the memory usage?

    In my case, they're on the same machine and there is plenty of free memory (less than 50% usage).
  • jbrajbra Member Posts: 32
    I have a similar issue on NAV 2018. NST PROD2 will not accept new connections. Existing connections are OK and PROD1 is fine. The NSTs are on the same box. After about 20 minutes PROD2 will accept new connections again, or I can restart the service and it will be okay. The environment is in Azure; traditional three tier, not BC.
  • RemkoDRemkoD Member Posts: 100
    I have investigated a few incidents like this in different environments (where one service tier does not accept new connections while other service tiers on the same application server work fine). After restarting the service tier, the NST can accept new connections again.

    In the cases where we've found a root cause it was related to a high load on the NST by either the quantity of users/sessions or by an application (read: NAV platform or external app) misbehaving causing a high load on the NST and/or SQL.

    A few tips that could help:
    • If many users are connected to one NST, consider balancing the load over multiple NSTs/servers.
    • Try to use a dedicated NST for the NAV end users. Move all other processes, like the task scheduler or interfaces with other apps (through SOAP/OData web services, for example), to other NSTs.
    • Upgrade the NAV platform to the latest cumulative update.
    • Instead of restarting the NST, kill the sessions one by one and check whether the service comes back up.