Discussion:
Using SSIS for near-real time ETL
Damon Allison
2006-01-16 21:46:04 UTC
Hello,

I am trying to create a near-real-time ETL (extract / transform / load)
process, where near real-time means under 2 minutes. When a database table
is inserted into or updated, I would like that data transformed and sent to
another database.

I have looked into using Service Broker and have a working solution;
however, it pushes Service Broker to its limit. In particular, it is
difficult to maintain transactional consistency and ordering (statement
1 and statement 2 must fire on the other database in the same order - in
Service Broker, I need to use the same dialog for all messages).

The transforms could potentially be intensive, however typically not.

http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_frm/thread/6c7a3f6ae3126b1d/06df8654a4e975a3

Would SSIS work for me? It appears SSIS is geared more toward batch-driven
ETL than near real-time. Also, I cannot find a way to fire an SSIS job via
T-SQL - how would I tell the job to fire (and pass variables to it)?

I realize that SSIS is probably not meant for such a task; I just
thought I'd ask. Are there other solutions besides Service Broker and
SSIS that would provide ETL-type functionality?

Thanks -
Damon
Jéjé
2006-01-16 22:52:29 UTC
SSIS will be great if you have to synchronize multiple sources, apply
transformations, and perform multiple tasks (processing cubes, sending
emails, logging activity, etc.).
SSIS can load data from a source, apply transformations, and load the
dataset directly into an OLAP partition, without any intermediate storage,
so you can update a cube more quickly.
(I have not tested this feature.)

But if you have to load data from one source to another without these
requirements, then write some T-SQL scripts to load your data, or create
triggers directly in the source database.
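A minimal sketch of the trigger approach (table and column names are
hypothetical, and the cross-database insert assumes both databases live on
the same instance):

```sql
-- Hypothetical: push each insert on dbo.Orders into TargetDb as it
-- happens. The trigger runs inside the original transaction, so the
-- copy is transactionally consistent - but it also makes the source
-- insert wait on the target write.
CREATE TRIGGER trg_Orders_Sync
ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO TargetDb.dbo.Orders (OrderId, CustomerId, Amount)
    SELECT i.OrderId,
           i.CustomerId,
           i.Amount        -- apply any inline transformation here
    FROM inserted AS i;
END;
```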

In the past I created DTS 2000 packages that loaded data from 4 sources
(1 SQL Server, 1 DBF, 1 Excel, and 1 flat file) and processed a cube every
5 minutes... and DTS worked fine for this job.
So I presume that SSIS provides better results and more options.

To start your SSIS job, you can create it as a SQL Agent job and start it
from a T-SQL statement (msdb.dbo.sp_start_job).
To pass variables in this setup, I recommend storing the variables in a
database table and having SSIS read that table.
So, first step: fill the parameter table; second step: start the job.
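That two-step pattern might look like this (the parameter table and job
name are hypothetical; sp_start_job is the standard msdb procedure and
returns as soon as the job is queued):

```sql
-- Step 1: write the parameters the package will read at startup.
INSERT INTO dbo.EtlParameters (ParamName, ParamValue)
VALUES ('BatchId', '42');
INSERT INTO dbo.EtlParameters (ParamName, ParamValue)
VALUES ('SourceTable', 'dbo.Orders');

-- Step 2: kick off the SQL Agent job that runs the SSIS package.
-- The job runs asynchronously; poll msdb for completion if needed.
EXEC msdb.dbo.sp_start_job @job_name = N'LoadOrdersPackage';
```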

SSIS packages can also be executed from a simple command line with dtexec,
and you can pass parameters on that command line.
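For example, a run that overrides a package variable might look like this
(the package path and variable name are made up; the /SET property-path
syntax follows the dtexec documentation):

```
dtexec /F "C:\Packages\LoadOrders.dtsx" ^
    /SET "\Package.Variables[User::BatchId].Properties[Value]";42
```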

Look at the help to learn more about this.