The "tell me what happened" or the responsibility gap

Miklos_Hollender Member Posts: 1,597
edited 2014-06-02 in General Chat
How do you handle these situations:

User: I got this weird result. What happened?

Programmer/Consultant / IT expert: I can't tell after the fact what happened back then. Unless you can give me a way to reproduce the weird behavior of the software I can do nothing.

User: No, either tell me what I did wrong or what I must do differently, or else I assume I am doing my job right, and therefore the system is wrong and must be fixed.

Programmer / Consultant / IT person: no, my job is to fix reproducible errors.

But what happens when you simply got wrong results, and it is neither a reproducible problem nor an identifiable user error? Whose responsibility is it, and who fixes it, and how?

E.g. we have a case where a foreign currency purchase invoice has the vendor, VAT etc. booked correctly, i.e. at, say, $100 converted to €72, but the inventory value is booked as €100 and the difference somehow became an indirect cost. Not a reproducible program error, not an identifiable user error!

At least half of the support requests I deal with are "tell me what happened". Which is very often not possible.

Comments

  • davmac1 Member Posts: 1,279
    I had a similar situation where a web function I wrote for NAV 2013 did not work properly. When I ran it thru the debugger it worked perfectly.
    It turned out the function I called to set initial values lost its settings when run normally without the debugger running.
    In your cases, one way of approaching it is to determine what can go wrong to cause the bad result.
  • Alex_Chow Member Posts: 5,063
    Miklos_Hollender wrote:
    At least half of the support requests I deal with are "tell me what happened". Which is very often not possible.

    That's because you're not trying hard enough.

    There is always a proper reply to the end users to explain what happened.
  • geordie Member Posts: 655
    It has happened to me a couple of times. Luckily I have always dealt with reasonable users, and I was able to agree on a phone or remote session with them for the next time they had to perform the same task.
    Often it turned out they were setting some parameters incorrectly; once it was a process bug.

    Remember to ask for EVERY single value they enter and, to be sure, the order of entry: on one occasion, with a heavily customized sales line table, the behavior differed depending on the order in which two fields were filled with the same value.

    Just my 2 cents.
  • rmv_RU Member Posts: 119
    There are many reasons why a user can get an unexpected result. In my experience the most unpleasant are phantoms in the database. Usually they cannot be reproduced.
    Looking for part-time work.
    Nav, T-SQL.
  • einsTeIn.NET Bochum Member Posts: 1,041
    We never had non-repeatable issues that were caused by the database system alone. There was always an additional Windows, network, third-party product, user, or whatever issue.

    I know that some issues are very hard to find, and sometimes it's not possible to prove users wrong when they assert that they did (or did not) do something. But in the end you'll find an explanation, or at least an idea of what happened. Sometimes the user won't trust you when you have just an idea, that's true. But I think it's not worth chasing issues forever or discussing them with unreasonable users when the issue is not repeatable and the system works as expected when you do it again.
    "Money is likewise the greatest chance and the greatest scourge of mankind."
  • rmv_RU Member Posts: 119
    einsTeIn.NET wrote:
    We never had non-repeatable issues that were caused by the database system alone. There was always an additional Windows, network, third-party product, user, or whatever issue.
    I mean uncommitted data or data modified by another process. Two examples from customized solutions:
    Processing external web orders:
    Wrong code, which created duplicate sales orders about once or twice a month:
    // No lock: two sessions can both read Status::Created before either creates the sales order.
    WebOrder.get(OrderID);
    WebOrder.testfield(Status, Status::Created);
    WebOrderMgt.CheckWebOrder(WebOrder.ID);
    WebOrderMgt.CreateSalesOrder(WebOrder.ID);
    
    Correct code. After this change testfield raised an error once, and some functionality was then rewritten:
    // Reads of WebOrder after locktable are locking reads, so a second session waits until the first commits.
    WebOrder.locktable;
    WebOrder.get(OrderID);
    WebOrder.testfield(Status, Status::Created);
    WebOrderMgt.CheckWebOrder(WebOrder.ID);
    WebOrderMgt.CreateSalesOrder(WebOrder.ID);
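
The difference between the two variants can be illustrated outside NAV. What follows is only a minimal Python sketch, not the original C/AL: `WebOrder`, `process` and `run_two_sessions` are hypothetical names, the `threading.Lock` stands in for locktable, and it assumes that creating the sales order is what moves the status away from Created. The sleep is an artificial delay that widens the window between the check and the action.

```python
import threading
import time


class WebOrder:
    """Hypothetical stand-in for the web order record."""

    def __init__(self):
        self.status = "Created"


def process(order, sales_orders, lock=None):
    """Check the status, then create a sales order (check-then-act)."""
    if lock is not None:
        lock.acquire()  # stands in for locktable before get
    try:
        if order.status != "Created":  # stands in for testfield
            return  # a late second session stops here
        time.sleep(0.2)  # artificial delay between check and action
        sales_orders.append("sales order")
        order.status = "Processed"  # assumption: creation changes the status
    finally:
        if lock is not None:
            lock.release()


def run_two_sessions(use_lock):
    """Run two concurrent sessions on one web order; return the order count."""
    order, sales_orders = WebOrder(), []
    lock = threading.Lock() if use_lock else None
    sessions = [
        threading.Thread(target=process, args=(order, sales_orders, lock))
        for _ in range(2)
    ]
    for session in sessions:
        session.start()
    for session in sessions:
        session.join()
    return len(sales_orders)
```

Without the lock both sessions pass the status check before either one writes, so the same web order yields two sales orders. With the lock the second session only gets to the check after the first has finished, and stops there.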
    

    Unposting functionality: deleting item ledger entries and application entries, and recalculating the remaining quantity. A missing locktable (!!!) caused an incorrect remaining quantity and Open flag once or twice per quarter.
      ItemAppEntry1.RESET;
      ItemAppEntry1.SETCURRENTKEY("Outbound Item Entry No.");
      ItemAppEntry1.SETRANGE("Item Ledger Entry No.", "Item Ledger Entry No.");
      IF ItemAppEntry1.FIND('-') THEN BEGIN
        REPEAT
          // Remember the application entry in a temporary table.
          ItemAppEntry1Temp:=ItemAppEntry1;
          IF ItemAppEntry1Temp.INSERT(FALSE) THEN;
          ILE1.RESET;
          ILE1.LOCKTABLE;//!!! this is the lock that was missing
          ILE1.GET(ItemAppEntry1."Inbound Item Entry No.");
          // Accumulate the recalculated remaining quantity per inbound entry in a temporary table.
          IF NOT ILETemp.GET(ILE1."Entry No.") THEN BEGIN
            ILETemp:=ILE1;
            ILETemp."Remaining Quantity" := ILETemp."Remaining Quantity" - ItemAppEntry1.Quantity;
            ILETemp.Open := (ILETemp."Remaining Quantity" <> 0);
            IF ILETemp.INSERT(FALSE) THEN;
          END ELSE BEGIN
            ILETemp."Remaining Quantity" := ILETemp."Remaining Quantity" - ItemAppEntry1.Quantity;
            ILETemp.Open := (ILETemp."Remaining Quantity" <> 0);
            ILETemp.MODIFY;
          END;
        UNTIL ItemAppEntry1.NEXT=0;
      END;
    
  • einsTeIn.NET Bochum Member Posts: 1,041
    I don't know what you want to say with that, but when you have incorrect code in your solution, it's always a repeatable issue. Of course, some issues might be hard to repeat, as you need exactly the same data to reproduce them. But that's a question of how much time you want to spend investigating instead of leaving it as it is.
  • rmv_RU Member Posts: 119
    einsTeIn.NET wrote:
    I don't know what you want to say with that, but when you have incorrect code in your solution, it's always a repeatable issue. Of course, some issues might be hard to repeat, as you need exactly the same data to reproduce them. But that's a question of how much time you want to spend investigating instead of leaving it as it is.
    Maybe you're right. I just want to note that there is a category of errors that occur periodically under multi-user activity and are very difficult to reproduce. You are very lucky if you have not come across them.
  • jglathe Member Posts: 639
    rmv_RU wrote:
    You are very lucky if you did not come across them.
    I can only second this. Tracking down concurrency problems is often the hard part.
  • einsTeIn.NET Bochum Member Posts: 1,041
    rmv_RU wrote:
    I don't know what you want to say with that, but when you have incorrect code in your solution, it's always a repeatable issue. Of course, some issues might be hard to repeat, as you need exactly the same data to reproduce them. But that's a question of how much time you want to spend investigating instead of leaving it as it is.
    Maybe you're right. I just want to note that there is a category of errors that occur periodically under multi-user activity and are very difficult to reproduce. You are very lucky if you have not come across them.
    Ah, I know what you mean. That's one of the issues I meant when I said some are very hard to repeat. Of course you can create an extra test environment where, e.g., all the processes are slowed down to simulate simultaneous input by several users. But that is not an environment you would typically keep available, so reproducing the issue would take enormous effort. That's why I said it's always a question of how much time you want to spend on certain issues.
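
On a small scale, the slowed-down environment can be approximated by injecting a pause at the suspected spot in a test harness. The following Python sketch only illustrates that idea and is not NAV code: `Entry`, `apply_quantity` and `reproduce_lost_update` are hypothetical names, and the barrier is an assumption standing in for whatever slow-down mechanism the test environment offers.

```python
import threading


class Entry:
    """Hypothetical ledger entry holding a remaining quantity."""

    def __init__(self, remaining):
        self.remaining = remaining


def apply_quantity(entry, quantity, pause):
    """Read-modify-write without a lock; pause is the injected slow-down."""
    snapshot = entry.remaining  # read
    pause()  # test hook between the read and the write
    entry.remaining = snapshot - quantity  # write based on a stale read


def reproduce_lost_update():
    """Force two sessions to read before either writes; return the result."""
    entry = Entry(remaining=10)
    # The barrier holds each session after its read until both have read,
    # which forces the interleaving that is rare in production.
    barrier = threading.Barrier(2)
    sessions = [
        threading.Thread(target=apply_quantity, args=(entry, qty, barrier.wait))
        for qty in (3, 4)
    ]
    for session in sessions:
        session.start()
    for session in sessions:
        session.join()
    return entry.remaining
```

A correct result would be 10 - 3 - 4 = 3. With the forced interleaving one of the two updates is always lost, so the function returns 6 or 7, depending on which session writes last.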
  • geordie Member Posts: 655
    rmv_RU wrote:
    Maybe you're right. I just want to note that there is a category of errors that occur periodically under multi-user activity and are very difficult to reproduce. You are very lucky if you have not come across them.

    Semi-OT: is there any best practice or tool to conduct these types of tests?
  • rmv_RU Member Posts: 119
    geordie wrote:
    Semi-OT: is there any best practice or tool to conduct these types of tests?
    In my opinion, the tool for that is the production database :).

    When this kind of error happens, I usually take these steps:
    1. Make sure that the record has not been changed since the error occurred.
    2. Get the timestamp of the record as a bigint using the following SQL code (replace company_name, table_name and primary_key with the correct values):
    select cast([timestamp] as bigint) from [company_name$table_name] where [primary key] = @primary_key
    
    3. Get the nearest timestamp from the log tables (change log entries, customized log entries).
    4. Compare the users' activity using the log tables.
    5. Revise the code.
  • mdPartnerNL Member Posts: 801
    rmv_RU wrote:
    geordie wrote:
    Semi-OT: is there any best practice or tool to conduct these types of tests?
    In my opinion, the tool for that is the production database :).

    When this kind of error happens, I usually take these steps:
    1. Make sure that the record has not been changed since the error occurred.
    2. Get the timestamp of the record as a bigint using the following SQL code (replace company_name, table_name and primary_key with the correct values):
    select cast([timestamp] as bigint) from [company_name$table_name] where [primary key] = @primary_key
    
    3. Get the nearest timestamp from the log tables (change log entries, customized log entries).
    4. Compare the users' activity using the log tables.
    5. Revise the code.

    About 3 and 5, could you explain or give an example of this? It's a very interesting topic, I think :)
  • rmv_RU Member Posts: 119
    edited 2013-12-19
    mdPartnerNL wrote:
    About 3 and 5, could you explain or give an example of this? It's a very interesting topic, I think :)
    -- Replace company_name, table_name and the primary key filter with the correct values.
    declare @timestamp bigint
    select @timestamp = cast([timestamp] as bigint) from [company_name$table_name] where [primary key] = @primary_key
    declare @entry_no int
    -- First change log entry written after the record's last modification.
    select top 1 @entry_no = [Entry No_] from [company_name$change log entry] where cast([timestamp] as bigint) > @timestamp order by cast([timestamp] as bigint)
    -- List up to 1000 log entries on either side of it, with their offset from the record's timestamp.
    select top 2000 @timestamp as original_timestamp, 
    	cast([timestamp] as bigint) as [timestamp], 
    	cast([timestamp] as bigint) - @timestamp as [offset]
    , * from [company_name$change log entry] where [Entry No_] between @entry_no - 1000 and @entry_no + 1000 
    order by [Entry No_]
    
    I can't give exact instructions for analysing the log entries; it depends on the error and on the logging policy. In my practice I prefer to log everything.
    General recommendations:
    1. Identify the user who modified the record. Usually this is the user whose change log entry has the smallest offset from the original timestamp.
    2. Find the smallest offset belonging to a different user ID. That user probably competed with the first user for the record.
    3. Analyse the activity of both users: check what other records were logged, ask the users, and try to find the incorrect piece of code.
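
Recommendations 1 and 2 can be sketched in a few lines once the log entries have been exported. This is only an illustration in Python under stated assumptions: `LogEntry`, `nearest_by_user` and `suspects` are hypothetical names, the sample data is invented, and the absolute offset is used, which is a simplification of the signed offset returned by the query.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    entry_no: int
    user_id: str
    row_version: int  # the timestamp column cast to bigint


def nearest_by_user(log, record_version):
    """Per user, keep the log entry with the smallest absolute offset."""
    best = {}
    for entry in log:
        offset = abs(entry.row_version - record_version)
        if entry.user_id not in best or offset < best[entry.user_id][0]:
            best[entry.user_id] = (offset, entry)
    # Sort users by how close their nearest log entry is to the record.
    return sorted(best.values(), key=lambda pair: pair[0])


def suspects(log, record_version):
    """Return (likely modifier, likely competitor) as user ids."""
    ranked = nearest_by_user(log, record_version)
    modifier = ranked[0][1].user_id if ranked else None
    competitor = ranked[1][1].user_id if len(ranked) > 1 else None
    return modifier, competitor


# Invented sample data, for illustration only.
sample_log = [
    LogEntry(1, "ANNA", 1000),
    LogEntry(2, "BOB", 1042),
    LogEntry(3, "ANNA", 1051),
    LogEntry(4, "CARL", 1300),
]
print(suspects(sample_log, record_version=1050))  # ('ANNA', 'BOB')
```

The result is only a starting point for step 3: it narrows down whose activity to look at, it does not prove who caused the error.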
  • mdPartnerNL Member Posts: 801
    rmv_RU wrote:
    About 3 and 5, could you explain or give an example of this? It's a very interesting topic, I think :)
    I can't give exact instructions for analysing the log entries; it depends on the error and on the logging policy. In my practice I prefer to log everything.
    General recommendations:
    1. Identify the user who modified the record. Usually this is the user whose change log entry has the smallest offset from the original timestamp.
    2. Find the smallest offset belonging to a different user ID. That user probably competed with the first user for the record.
    3. Analyse the activity of both users: check what other records were logged, ask the users, and try to find the incorrect piece of code.

    OK, so you are talking about the NAV change log entries. I was thinking of the SQL log, as that is always active.

    The log in NAV is not always on, but when you have locking or support problems and then switch it on, this will of course work. Thx.
  • David_Singleton Member Posts: 5,475
    Alex Chow wrote:
    At least half of the support requests I deal with are "tell me what happened". Which is very often not possible.

    That's because you're not trying hard enough.

    There is always a proper reply to the end users to explain what happened.

    :thumbsup:
    David Singleton
  • Miklos_Hollender Member Posts: 1,597
    Why would it be always? Perhaps in customizations / add-ons, yes, as they are generally simple and easy to oversee, and you understand the purpose of each piece if you made it.

    However, the standard is largely a huge jumble of overly complicated code for purposes that could be achieved much more easily. There is no documentation explaining what each piece of code is meant to achieve, and if you look into the next version you see that codeunit completely rewritten, probably because it was buggy as hell. So I think there are untrackable situations. I have found them, e.g., as orphaned reservation entries, purchase prepayments that were deducted in a wrong way, completely crazy results out of inventory adjustment, over-the-top indirect costs and so on. I don't think it is possible to track them down if they are not reproducible, because that involves understanding the purpose of large chunks of code that don't make a lot of sense; and when you look into the next version and see they are completely rewritten, then yeah, they never made a lot of sense to begin with.
  • DenSter Member Posts: 8,281
    All I can say about that is that some people are just not very good at figuring things out. The easy way out is to point the finger at "code that doesn't make sense". Not saying that some of the code is not unnecessarily complicated, because it definitely is, but like Alex says there is ALWAYS an explanation, it just takes persistence to get to the bottom of it. Some people have a lot of persistence, some people don't.

    I don't mean this in any disrespectful way, because I've given up on plenty of bug-hunts, but it really feels like a cop-out to just say "I can't explain this because it is crap code". I usually explain it as "I could spend more time on trying to figure this out, do you want me to continue?" and then it's the customer's choice to spend more money to keep digging.
  • Miklos_Hollender Member Posts: 1,597
    "and then it's the customer's choice to spend more money to keep digging."

    Sure, as if anyone would be actually willing to pay for it. Wait, what?

    Seriously, DenSter, somehow you and some other folks around here sound like you are working on an entirely different planet than me: one with customers who seriously accept the time-based billing concept in everything. I mean, even I wouldn't in private life, so I don't expect anyone else to. I take only fixed quotes from my car mechanic, not estimates, because I have all the power as a customer and they have zero; it is 99% a buyer's market, so I can force them to take all the inherent risks of estimation. And I know, working at an end-user site, that I could never sell any project to my boss if it was not 200% fixed price and with warranty. This is my planet. On yours, customers just accept anything getting billed, even if it looks like you selling them imperfect software that does unexpected things. Astonishing. (Even now, working internally and thus with more trust than external consultants get, I still get the mood that anything unexpected that happens is somehow my fault.) Is your planet such a seller's market? If yes, how come you guys are forced to have infinite amounts of patience and dedication?

    In the meantime, I somehow live on an entirely different planet. For example, I am currently migrating a customization by a Danish!!! partner with a fairly good sounding name that has so low standards that it has chunks of Danish texts hardcoded right in the middle of the code. And it only gets worse from here. On my planet my level of dedication to quality is exceptional - I don't think I have ever tried to bill for work that did not actually solve a problem - and yet your standards come across as infinitely high to me.

    I live on a planet with suspicious clients and almost everybody doing worse work than me. Somehow your planet, yours and some others folks' planet around here just does not work the same way as mine, it is about infinitely flexible clients and infinitely high quality consultants.

    I mean I kind of know this sort of your planet exists - it is the planet of seriously big business. Like, banks. Insurance companies. The kind of businesses that don't even have a single owner who decides everything, they usually have shareholders. You know places that have actual processes, not just the whims of the owner. These kind of really corporatey places that have stuff like actual HR and similar corporatey stuff. The only thing I don't quite understand is what does NAV even do in that kind of world? I mean that is typically SAP turf.
  • matttrax Member Posts: 2,309
    Maybe we do because I tell customers the same thing that Daniel does. I worked for an end user for four years, so I understand your perspective. Of course you want a fixed price and a warranty, I don't blame you, but think about it from a software vendor's point of view:

    We write an add-on for a CRONUS database because we have to have some sort of baseline. You want a fixed price to implement that add-on in your customized database, essentially a different baseline. In fact, a baseline we have never seen before. You want a warranty, but will continue to do your own development to the database that may cause instabilities with the add-on. We have no control over this. And you want a fixed price for support, even though you have changed the software after it was implemented. On what planet could a company come up with an accurate fixed price estimate for this? And when they can't why would they ever risk coming in too low?
    Miklos_Hollender wrote:
    I don't think I have ever tried to bill for work that did not actually solve a problem - and yet your standards come across as infinitely high to me.
    That sounds like you would want to bill on time spent, the opposite of a fixed price. If the problem is solved in 1/4 of the time you are still paying for the other 3/4, regardless if work is being done.

    Your mechanic example is the perfect counter to your argument oddly enough. In the US mechanics do typically have a fixed price for every little thing they do. It's an industry standard. But they are so efficient that they always come in under the specified amount of time. You just end up paying more money than time and expense would have cost you.

    As for some of your smaller points, I'm afraid they are not unique.
    partner with a fairly good sounding name that has so low standards that it has chunks of Danish texts hardcoded right in the middle of the code
    ...
    I still get the mood that anything unexpected happens is somehow my fault.
    ...
    and almost everybody doing worse work than me.

    There's bad development everywhere, and that doesn't mean just code. Have a read of The Design of Everyday Things by Donald Norman. And every company blames something or someone else, technology is just the easiest target because fewer people understand it. It's part of every job and every aspect of life.

    If you ask me it is your standards that come off infinitely high, but the point I always try to make in these lengthy posts is that it's all about perspective. End User perspective vs. VAR / Partner perspective. Mine vs. Yours. Small company vs. Big Company. Perspectives based on our parts of the world, experience levels, training, position, and on and on. And to me that's the whole point of these posts and the community at large: understanding each other and using those perspectives to improve ourselves.
  • DenSter Member Posts: 8,281
    I live on a planet with suspicious clients and almost everybody doing worse work than me.
    :-k as my daughter would say... mmK