How identify new records in any navision table?

obeliszs · 2012-01-21

Hi I'm working with SSIS and the data are daily updating. But I need to retrieve just new records from navision table (it can be any major table). Does there is some approach? How I undirstand the timestamp in navision table just identifying row?
Maybe there is some secret table with real unix timestamps and rows timestamps or something by which I can indentify new rows.

There are two ways:
1. select just new records from navision table
2. select all records and then do lookup...

First is the better, but is it possible?

ThomasHej_MSFT · 2012-01-22

I think we need a little more information around what you are trying to accomplish.

Getting “all new” (and only all new records) is not that easy. NAV offers a mechanism call Global Triggers that allows you to trap changes in real time to the database and – as you mention – SQL-timestamp could also be used if you write a query directly against SQL

The SQL timestamp is just a “counter” that is guaranteed to count upwards – Microsoft does not officially support getting any other information (e.g. date or time) from this field. It is supported to compare a timestamp field with timestamp fields from other records to find the “latest” one.

Finally SQL triggers could also be made to aid your scenario.

You will however need to realize, that all SQL timestamp mechanisms will yield all changed/inserted records since “last timestamp” but no info around deleted records.

Knowing which tables to “watch” will help you a lot, but we might provide more help if you could describe your scenario and why you need this…

Thomas

Finally - SQL Timestamp is being deprecated and is replaced by the more correctly describing name: RowVersion (but the concept is the same)

David_Singleton · 2012-01-22

For SSIS integration the simplest is just to create a new Navision field "Last Edited On" in the particular table.

obeliszs · 2012-01-22

Not always you can go inside navision tables and edit them:) big companys with their own policys...
So question about the timestamp in navision tables

The SQL timestamp is just a “counter” that is guaranteed to count upwards

I can convert it to integer and vice versa?

If so it probably worked out, to take the last integer (converted timestamp to integer ) in my table and then just select records upwards that integer from navision table...

ThomasHej_MSFT · 2012-01-23

Yes, you should be able to select the timestamp column.

All NAV tables are created with a timestamp column (named 'timestamp'), SQL does not create such a column automatically, but in NAV-land we always create that column on the tables.

The fieldtype of the timestamp column is a binary(8) so you will get results like '0x0000000054e3d85c' if you do a raw select.

Converting this to a 64 bit integer can be done by doing a

SELECT CONVERT(bigint,timestamp),<other fields> FROM <tablename>

Having that number you can now add a WHERE clause.

SELECT CONVERT(bigint,timestamp),<other fields> FROM <tablename> WHERE CONVERT(bigint,timestamp)&gt;12345

or simply

SELECT CONVERT(bigint,timestamp),<other fields> FROM <tablename> WHERE timestamp&gt;12345

As you can see SQL does support implicit conversion here, so you don't need the CONVERT in the WHERE clause (although it does not harm to put it there)

This will give you all records that has a numeric timestamp (version) greater than 12345. Notice, this will include all new rows as well as all changed rows. Put otherwise: The timestamp gets updated on all inserts and modifications.

Hope this will solve your scenario...

DenSter · 2012-01-23

Great information

I'm not sure about the binary timestamp. With this CONVERT statement, you essentially convert it to something humans can read, and I am wondering if it is necessary to do it in a query for ETL purposes.

Does the raw data sort properly? What I want to know is if we can use the raw field values to query, so that I don't have to worry about proper conversions.

I did the following two queries:

SELECT [timestamp] FROM [std2009R2$Cust_ Ledger Entry] WHERE
CONVERT(bigint,[timestamp]) <  130940

SELECT [timestamp] FROM [std2009R2$Cust_ Ledger Entry]
WHERE timestamp < 0x000000000001FF7C

This produced the same results, so I am guessing it doesn't matter if you convert the value or not. Will this work on large datasets too?

David_Singleton · 2012-01-23

DenSter wrote:
Great information

I'm not sure about the binary timestamp. With this CONVERT statement, you essentially convert it to something humans can read, and I am wondering if it is necessary to do it in a query for ETL purposes.

Does the raw data sort properly? What I want to know is if we can use the raw field values to query, so that I don't have to worry about proper conversions.

I did the following two queries:
SELECT [timestamp] FROM [std2009R2$Cust_ Ledger Entry] WHERE
CONVERT(bigint,[timestamp]) <  130940

SELECT [timestamp] FROM [std2009R2$Cust_ Ledger Entry]
WHERE timestamp < 0x000000000001FF7C
This produced the same results, so I am guessing it doesn't matter if you convert the value or not. Will this work on large datasets too?

I had the same question. If we can do it this way, and be sure it works it will make integration much simpler.

:thumbsup:

David_Singleton · 2012-01-23

ThomasHej_MSFT wrote:

Finally - SQL Timestamp is being deprecated and is replaced by the more correctly describing name: RowVersion (but the concept is the same)

Thomas,

thank you for this information. It's always good to know what's going on rather than guessing. Especially when a client is relying on it.

Is there any chance that RowVersion will be documented (aka supported) in future versions. For some clients it is hard (politically) to use a feature that is not supported.

obeliszs · 2012-01-23

ThomasHej_MSFT wrote:
Yes, you should be able to select the timestamp column.

All NAV tables are created with a timestamp column (named 'timestamp'), SQL does not create such a column automatically, but in NAV-land we always create that column on the tables.

The fieldtype of the timestamp column is a binary(8) so you will get results like '0x0000000054e3d85c' if you do a raw select.

Converting this to a 64 bit integer can be done by doing a
SELECT CONVERT(bigint,timestamp),<other fields> FROM <tablename>
Having that number you can now add a WHERE clause.
SELECT CONVERT(bigint,timestamp),<other fields> FROM <tablename> WHERE CONVERT(bigint,timestamp)&gt;12345

or simply

SELECT CONVERT(bigint,timestamp),<other fields> FROM <tablename> WHERE timestamp&gt;12345
As you can see SQL does support implicit conversion here, so you don't need the CONVERT in the WHERE clause (although it does not harm to put it there)

This will give you all records that has a numeric timestamp (version) greater than 12345. Notice, this will include all new rows as well as all changed rows. Put otherwise: The timestamp gets updated on all inserts and modifications.

Hope this will solve your scenario...

WOW!! if this is true it would be very very good, I'll try this:) thanks ThomasHej_MSFT:)

davmac1 · 2012-01-23

A method I successfully used was to add table triggers to store changes.
Using the easy way, using SQL to look at the inserted and deleted tables created problems by changing the timestamp.
Rashed Amini (Ara3n) on this forum advised me to use cursors instead which worked without problems.
This has been a while - it was SQL 2000 and NAV 4.01, but it would be worth trying with your version (providing you know a little SQL).

ThomasHej_MSFT · 2012-01-24

David Singleton wrote:

Is there any chance that RowVersion will be documented (aka supported) in future versions. For some clients it is hard (politically) to use a feature that is not supported.

Ok, let me be more precise. Timestamp and rowversion are exactly the same thing on SQL Server. The unfortunate situation is that ISO standards describe SQL Timestamps as something related to date/time whereas MSSQL server decided to implement this as a simple counter. DB2, Oracle and other relational databases uses an actual timestamp (point in time) for timestamp.
Over time this has caused some confusion since developers using other databases as "used" to be able to extract date/time info from a timestamp, which is not possible on MSSQL.
Therefore MSSQL introduced the more correct name Rowversion to describe the same thing.
Furthermore the syntax when creating a table with a timestamp differs a litte from syntax creating a table with at rowversion (name of the column not needed when using timestamp)

The feature as such is there to stay and can be safely used.

DenSter wrote:

...this produced the same results, so I am guessing it doesn't matter if you convert the value or not

Correct. The timestamp (being a counter) was originally created as a Binary[8], I assume that type was chosen since the big integer wasn't available before SQL Server 2000. Today it might as well have been a BigInt but is really does not matter since there is an implicit cast/convertion between bigint and binary[8].

DenSter wrote:

Will this work on large datasets too?

The obvious answer: Yes! timestamp will act as any "normal" column.

The thoughtful answer: Watch out here. Problem is that SQL does not implicitly create an index on the timestamp column, so the WHERE timestamp>12345 will enumerate all records in the table (tablescan) unless you filter by other criteria (then only the subset will be enumerated).

To overcome this, you could add an index (manually) including the timestamp field - this would definetely solve the SELECT ... WHERE issue, but will add additional index-overhead to ANY modification since the timestamp gets updated on each modification.

Where does that leave you?

If you need to track all changes any technique (matching lookup, adding changes to another table, tracking deletions) will all be expensive and an index on timestamp will be as bad as any other mechanism.

In this scenario however, where you ONLY need to track new records (inserts) you might consider other techniques (after all - a record only gets inserted once in its lifetime), but this will of course require changes to the code or adding extra fields to tables in question.

But as always - try it out and measure,measure,measure.. :-)

DenSter · 2012-01-24

Thanks for taking the time to write such a thorough reply, it really helps understand how this works.

:thumbsup:

mdPartnerNL · 2012-01-24

here here =D>

ndbcs · 2012-01-25

What about SQL Server Change Tracking (Standard Edition) or SQL Server Change Data Capture (Enterprise Edition only).
Those features were made for what you are looking for.

I think for your purposes CT is the smartest solution with the least amount of overhead on SQL Server.

I have worked with CT and the timestamp approach in real-life replication-like scenarios.
Both work pretty well.

I think the timestamp approach doesn't scale very well, because of the index-issue already mentioned.
In NAV 2009 R2 I would think about using the new global triggers instead of using the timestamp approach for small-scale solutions.
For the bigger-scale solutions I would use CT or CDC.

cheers
Tobias

ThomasHej_MSFT · 2012-01-25

ndbcs wrote:

What about SQL Server Change Tracking (Standard Edition) or SQL Server Change Data Capture (Enterprise Edition only).
Those features were made for what you are looking for.

Yes, they should also be evaluated in the general scenarios. (Thanks for pointing that out here!)

One need to realize however, that nothing is for free here. Both these techniques put the burden on SQL server to "track" the primary key or the entire record of changed records. This in turn means that every update will now generate inserts in tracking tables which again will require indexes (plural) to be updated.

Creating a single index on timestamp will - at least in theory - be more efficient, but this is exactly why one needs to measure... :-)

ndbcs · 2012-01-25

An index on the timestamp column has one entry for each record. The index includes the timestamp plus all clustered index fields.
The index will become fragmented very fast.

CT only holds records for modified records and it cleans up "old" records.

So the timestamp approach needs much more storage/ram/backup capacity, I think.
It also needs more ressources for index maintenance.

Because CT is made for those scenarios I think it also performs faster and with less possibility of blocking/deadlocking issues.

So, I guess in general the CT approach is much more efficient.

But, as you said and as always, you need to test it to be sure.

cheers
Tobias

Howdy, Stranger!

Quick Links

Categories

How identify new records in any navision table?

Comments