> My experience with that type of load process is that doing this > row-by-row is a very expensive approach and your results bear that out. I expected this, and had warned the client before the project started that this is exactly where SQL underperforms. > It is often better to write each step as an SQL statement that operates > on a set of rows at one time. The problem with this approach is that every row of data is dependent on the previous row's data being validated and imported. e.g. Import Row 1: John Q Smith Foobar Corp 123 Main St, Bigtown, MD 12345-6789 Import Row 2: John Quincy Smith FuzzyLoginc Inc 123 Main St, Suite 301 Bigtown, MD 12345-6789 Import Row 3: Bobby Jones Foobar Corp 123 Main Strett Suite 300, Bigtown, MD 12345 Every row must be imported into the table so that the next row may see the data and consider it when assigning ID's to the name, company and address. (all data must be normalised) How can this be done using set logic? > You can also improve performance by ordering your checks so that the > ones most likely to fail happen first. Already done - I believe the problem is definitely in the navigational access model. What I am doing now makes perfect sense as far as the logic of the process goes - any other developer will read it and understand what is going on. At 3000 lines of code, this will be tedious, but understandable. But SQL hates it. > Trying to achieve a high level of data quality in one large project is > not often possible. Focus on the most critical areas of checking and get > that working first with acceptable performance, then layer on additional > checks while tuning. The complexity of the load programs you have also > means they are susceptible to introducing data quality problems rather > than removing them, so an incremental approach will also aid debugging > of the load suite. I couldn't agree more. Carlo