NST refuses new connections, active ones work

Phenno Posts: 628 Member
edited 2015-03-18 in NAV Three Tier
Hello all.

We have a problem for which I cannot find the cause...

From time to time (quite often) we have a situation where NAV 2013 R2 clients cannot log in to NAV while active connections on that tier keep working fine. This goes on until the NST is restarted. During the period when clients cannot create new connections, there are no error events in Event Viewer, only a large number of information events in the Security log: event ID 5152, WFP blocked a packet on port 7046.

I've searched for this event ID and found that it is part of the Windows Filtering Platform (WFP): Windows blocks a packet to a certain port, even if that port is open in the firewall (or the firewall is down), because no service is listening on that port, thus preventing port scanning (this is called Windows stealth mode, and it works separately from Windows Firewall). Since existing NAV connections are working properly, I can say that port 7046 has a listening service, so I'm stuck: why would Windows block this packet if the port is open and the service is working?
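
For anyone who wants to verify the "is something listening on 7046" part independently of the NAV client, here is a minimal Python sketch. It only checks whether a plain TCP connection can be opened; the function name and the host name in the comment are made up for illustration, and a successful connect does not prove that NAV would actually create a session.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


# Example call ("nav-server" is a hypothetical host name, 7046 is the port
# from the 5152 events): port_accepts_connections("nav-server", 7046)
```

If this returns True while the NAV client still fails, the problem is more likely above the TCP level than in the firewall or WFP.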

I have checked the Max Concurrent Calls and Max Concurrent Connections settings on the NST, and they are set way beyond the number of users.

Windows 2012 R2
NAV 2013 R2 with CU 10
Three instances in use, one client and two NAS.
No separate AV installed.

This happens from time to time, but usually when there are a lot of users already connected (50+).

Does anyone have an idea what could be causing this?

Comments

  • Phenno Posts: 628 Member
    After some thorough analysis, I can rule out network-related, Windows-related, and third-party software issues. It seems that NAV really is refusing new connections for some reason.

    I've double-checked max concurrent calls (150) and max connections (150), but what about other settings such as:
    - Operation timeout (MaxValue),
    - Idle Client Timeout (1h),
    - Reconnect period (10min),
    - SQL command timeout (30min),
    - Max orphaned connections (20)?
  • vremeni4 Posts: 290 Member
    Hi,

    When users cannot connect to the NAV Server service, what error message do they get?
    If the error message is permission related, then something is not right with AD.
    If the error message is network related, then something is wrong at the network level.

    Thanks,
  • Phenno Posts: 628 Member
    vremeni4 wrote:
    Hi,

    When users cannot connect to the NAV Server service, what error message do they get?
    If the error message is permission related, then something is not right with AD.
    If the error message is network related, then something is wrong at the network level.

    Thanks,

    They get a message that the NAV connection failed (after 20 seconds or so).
  • vremeni4 Posts: 290 Member
    Hi,

    This is definitely a network issue. It is not NAV related but Windows Server related.

    You can try to change the WFP settings in the Active Directory policies.
    The administrator of the system should be able to help you with this issue.

    This KB may also be helpful:
    http://support.microsoft.com/kb/2654852

    Installing the latest SP may be an option too.

    I hope this helps.
    Thanks.
  • Phenno Posts: 628 Member
    Vremeni4, this is Windows 2012 R2, so this KB does not apply to it.

    As for the network, when the server is refusing new connections, I cannot connect from a local client either, and in that case I do not see a logged WFP event (or maybe I missed it).

    Does a local connection to the server go through WFP?
  • Rikt-It Posts: 35 Member
    Hi,

    Check the account used by the service tier. Its password has probably expired.

    //Christer
    Regards
    Christer in Stockholm, Sweden
  • Phenno Posts: 628 Member
    Rikt-It wrote:
    Hi,

    Check the account used by the service tier. Its password has probably expired.

    //Christer

    Rikt-It,

    This occurs from time to time and can be solved by simply restarting the service tier, so I do not think that it is caused by wrong service credentials.
  • vremeni4 Posts: 290 Member
    Hi,

    I think WFP checks all connections, regardless of whether they use localhost or a full IP.

    Did you try to disable WFP with a group policy in Active Directory?

    Thanks.
  • Phenno Posts: 628 Member
    vremeni4 wrote:
    Hi,

    I think WFP checks all connections, regardless of whether they use localhost or a full IP.

    Did you try to disable WFP with a group policy in Active Directory?

    Thanks.

    Vremeni, I do not think that WFP is the cause by itself, but rather a consequence.

    I'm testing a changed setup with the max calls and max connections parameters set to MaxValue. I'm still getting WFP notifications occasionally, which are caused by a poor network IMHO, but that's OK. I'm now waiting to see if it will block completely again...
  • Phenno Posts: 628 Member
    The changed setup helped for a while, but the problem occurred again. The NAV server stopped connecting new clients.

    I'm open to any new diagnostic ideas...

    There is a difference, though: I did not see 5152 events this time.

    Checked the firewall, it is OK.
    Checked the connection from a local client, netstat says ESTABLISHED (hence not a network problem).
    Started a second, backup instance and connected to it normally, meaning NAV and SQL are communicating.

    Allowed connections set to MaxValue, allowed calls to MaxValue, timeout set to one hour, orphaned connections to MaxValue.

    Anyone had similar experience?
  • xStepa Posts: 31 Member
    Hi,

    Well, just a point to think about: I saw something similar with NAV 2009 Classic. After some time the SQL server started to refuse new connections, while active users were still able to work.
    As I remember, there was a bug in NAS. There was an old test NAS service with wrong credentials, which was only stopped (not disabled). After some time, they restarted the server and this service attempted to log in. Because of this bug, each unsuccessful attempt consumed 1 CAL, and after some time it had consumed everything. Only a restart of SQL helped...
    Regards
    xStepa
  • Phenno Posts: 628 Member
    Interesting to analyze, since I do have two separate NAV instances for the job scheduler (one for LS Retail and one for standard NAV).

    There are differences with NAV 2013, though: NAS sessions are of type background and, AFAIK, not counted against the licence. Furthermore, wouldn't I get a licence error in that case? I'm only receiving "NAV could not establish connection to server".

    But I would probably need to inspect the Active Session table (which is, to my knowledge, used for the session count) at the moment the service denies connections. Maybe there is some licence issue involved...
  • Phenno Posts: 628 Member
    Well... The same thing happened again. I checked the Active Session table, nothing suspicious there. I successfully connected to the backup instance while the main one was refusing new connections, and only a restart of the main instance helped.

    I've checked the cumulative updates for R2, nothing much on a similar subject.

    I've checked performance counters, nothing suspicious.

    I've checked SQL stats for blocked processes or similar, nothing suspicious there either.

    The quest continues...

    How does NAV check for licences?

    Could it be something with DD for LS Retail, which is installed and working on the same server?
  • dave_c Posts: 39 Member
    One of our customers has the same issue every 6 months or so. We've logged a call with Microsoft but struggled to get the logs from Microsoft Network Monitor to debug the issue whilst everyone is screaming that they can't log into NAV.

    Did you ever get to the bottom of the issue? I will try and remember to update this thread if we do!
  • Phenno Posts: 628 Member
    Dave,

    I never found a solution for this, but it has been occurring rarely lately. My conclusion was that it was due to network instability between clients and server, causing sessions to hang, or something like that.
  • RemkoD Posts: 4 Member
    edited 2016-11-14
    Next time you have the issue, do you see any suspicious queries in SQL Server Management Studio > Activity Monitor > Recent Expensive Queries?

    I'm currently investigating an issue related to the Session Event table auto cleanup. Once every 3 months (the default NST value of SessionEventTableRetainPeriod) the NST starts a cleanup function to remove all records older than 3 months from the Session Event table. Somehow, in some customer environments, multiple NSTs start the cleanup task at the same time, causing locking issues on the Session Event table. As you can guess by now, the result is that new NAV sessions cannot be created.
    I have a call open with Microsoft for this. If you have a similar issue, I will let you know the progress on this.

    Edit: Since you're using only 3 NSTs and around 50 users, this might not be the issue.
  • Phenno Posts: 628 Member
    Remko, at least you gave me one more thing to look up while the instance is stuck: checking which tables are locked at that moment. I do not think it's due to the Session Event table, though, since that would lock all instances at the same time. In my case only one instance is locked (usually).

    Though I should definitely check for long-running queries too...
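
    A rough sketch of how that lock check could be scripted so it is ready when the instance gets stuck. It reads the standard SQL Server DMV sys.dm_exec_requests; the function name is made up, and `cursor` is assumed to be any Python DB-API cursor already connected to the SQL Server that hosts the NAV database (for example one obtained through pyodbc). Nothing here is NAV specific.

```python
# Requests that are currently waiting on a lock held by another session.
BLOCKED_REQUESTS_SQL = """
SELECT r.session_id, r.blocking_session_id, r.wait_type, r.wait_time, r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC
"""


def find_blocked_requests(cursor):
    """Run the query on an open DB-API cursor and return one dict per blocked request."""
    cursor.execute(BLOCKED_REQUESTS_SQL)
    # cursor.description holds one tuple per column; the first item is the column name.
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

    An empty result while new logins are failing would suggest the problem is not SQL blocking. A non-empty one gives the blocking session ID to look up in Activity Monitor.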
  • jbra Posts: 24 Member
    I suddenly have this issue with 1 NST of 2 in our production environment. 2009 R2 on Windows Server 2008 R2.

    When it 'goes down', existing connections are fine. Most users cannot make new connections, but some (apparently) can.

    Restarting either the service or the server temporarily resolves the problem.