Creating index in a large table  
Author Message
hboss24





PostPosted: Tue Jan 10 18:04:14 CST 2006 Top

SQL Server >> Creating index in a large table

Hello:

We have a table with millions of records. Not long ago, we added a new
column to this large table, and we like to populate this new column with data
and create an index for it.

Would there be any time difference if we create an index (non-clustered)
before we populate it with data, or after we populate it with data? In
another word, is it true that it takes lesser time to create an index in a
column with only null?

Thanks,

Q

SQL Server249  
 
 
David





PostPosted: Tue Jan 10 18:04:14 CST 2006 Top

SQL Server >> Creating index in a large table Q:
I would hazard a guess that you are better off to populate the new column
with data and then create the non-clustered index rather than the other way
around. (Assumption: you have an existing Clustered Index on the table that
is hopefully on an integer field)

Rationale: When you create a non-clustered index on a table that has a
clustered index you create a B-Tree structure that has the cluster keys.

If you create the index and then populate, you are making the updates work
harder because it has to update the leaf level of the clustered index (where
the actual data is) as well as insert or manipulate the non-clustered index.
As updates occur your index will have page splits and the index pages will
become fragmented.

If you populate and then create the index you will create the index B-Tree
structure all in one shot and it will not be fragmented.

--
David Lundell
Principal Consultant and Trainer
www.MutuallyBeneficial.com




> Hello:
>
> We have a table with millions of records. Not long ago, we added a new
> column to this large table, and we like to populate this new column with
> data
> and create an index for it.
>
> Would there be any time difference if we create an index (non-clustered)
> before we populate it with data, or after we populate it with data? In
> another word, is it true that it takes lesser time to create an index in a
> column with only null?
>
> Thanks,
>
> Q


 
 
Q





PostPosted: Tue Jan 10 18:28:02 CST 2006 Top

SQL Server >> Creating index in a large table Hello David:

Thanks for your detail reply. Good to know that it would create more work
if there is already a clustered index and then if we add a new column, a
non-clustered index to the new column, then populate data to this new column.

There is no clustered index in this table. So can I assume there is no
overhead in insert/update, so there will not be fragmentation in split index
page?

Thanks

Q



> Q:
> I would hazard a guess that you are better off to populate the new column
> with data and then create the non-clustered index rather than the other way
> around. (Assumption: you have an existing Clustered Index on the table that
> is hopefully on an integer field)
>
> Rationale: When you create a non-clustered index on a table that has a
> clustered index you create a B-Tree structure that has the cluster keys.
>
> If you create the index and then populate, you are making the updates work
> harder because it has to update the leaf level of the clustered index (where
> the actual data is) as well as insert or manipulate the non-clustered index.
> As updates occur your index will have page splits and the index pages will
> become fragmented.
>
> If you populate and then create the index you will create the index B-Tree
> structure all in one shot and it will not be fragmented.
>
> --
> David Lundell
> Principal Consultant and Trainer
> www.MutuallyBeneficial.com

>


> > Hello:
> >
> > We have a table with millions of records. Not long ago, we added a new
> > column to this large table, and we like to populate this new column with
> > data
> > and create an index for it.
> >
> > Would there be any time difference if we create an index (non-clustered)
> > before we populate it with data, or after we populate it with data? In
> > another word, is it true that it takes lesser time to create an index in a
> > column with only null?
> >
> > Thanks,
> >
> > Q
>
>
>
 
 
bradsbulkmail





PostPosted: Tue Jan 10 18:34:04 CST 2006 Top

SQL Server >> Creating index in a large table Create the index *after* updating the column with data. Creating the
index first will add overhead during the update process.




>Hello:
>
>We have a table with millions of records. Not long ago, we added a new
>column to this large table, and we like to populate this new column with data
>and create an index for it.
>
>Would there be any time difference if we create an index (non-clustered)
>before we populate it with data, or after we populate it with data? In
>another word, is it true that it takes lesser time to create an index in a
>column with only null?
>
>Thanks,
>
>Q
 
 
David





PostPosted: Tue Jan 10 21:14:37 CST 2006 Top

SQL Server >> Creating index in a large table Q:

You actually still get the overhead even with a non-clustered index on a
table that does not have a clustered index. Although it is fairly unusual
for heavily used tables to not have a clustered index. It would be wise to
take an overall look at your indexing strategy


--
David Lundell
Principal Consultant and Trainer
www.MutuallyBeneficial.com




> Hello David:
>
> Thanks for your detail reply. Good to know that it would create more work
> if there is already a clustered index and then if we add a new column, a
> non-clustered index to the new column, then populate data to this new
> column.
>
> There is no clustered index in this table. So can I assume there is no
> overhead in insert/update, so there will not be fragmentation in split
> index
> page?
>
> Thanks
>
> Q
>

>
>> Q:
>> I would hazard a guess that you are better off to populate the new column
>> with data and then create the non-clustered index rather than the other
>> way
>> around. (Assumption: you have an existing Clustered Index on the table
>> that
>> is hopefully on an integer field)
>>
>> Rationale: When you create a non-clustered index on a table that has a
>> clustered index you create a B-Tree structure that has the cluster keys.
>>
>> If you create the index and then populate, you are making the updates
>> work
>> harder because it has to update the leaf level of the clustered index
>> (where
>> the actual data is) as well as insert or manipulate the non-clustered
>> index.
>> As updates occur your index will have page splits and the index pages
>> will
>> become fragmented.
>>
>> If you populate and then create the index you will create the index
>> B-Tree
>> structure all in one shot and it will not be fragmented.
>>
>> --
>> David Lundell
>> Principal Consultant and Trainer
>> www.MutuallyBeneficial.com

>>


>> > Hello:
>> >
>> > We have a table with millions of records. Not long ago, we added a new
>> > column to this large table, and we like to populate this new column
>> > with
>> > data
>> > and create an index for it.
>> >
>> > Would there be any time difference if we create an index
>> > (non-clustered)
>> > before we populate it with data, or after we populate it with data? In
>> > another word, is it true that it takes lesser time to create an index
>> > in a
>> > column with only null?
>> >
>> > Thanks,
>> >
>> > Q
>>
>>
>>


 
 
David





PostPosted: Tue Jan 10 23:54:01 CST 2006 Top

SQL Server >> Creating index in a large table
> Hello:
>
> We have a table with millions of records. Not long ago, we added a
> new column to this large table, and we like to populate this new
> column with data and create an index for it.
>
> Would there be any time difference if we create an index
> (non-clustered) before we populate it with data, or after we populate
> it with data? In another word, is it true that it takes lesser time
> to create an index in a column with only null?
>
> Thanks,
>
> Q

Definitely update before the index. The update process is going to cause
excessive logging because of the number of rows - which means you should
update in batches (say 100,000 rows at a time if possible). Each update will
require a corresponding index update if th eindex is in place, causing even
more work. Create the index after the column is loaded with data for the
rows in question.

--
David Gugick
Quest Software


 
 
Q





PostPosted: Wed Jan 11 14:41:02 CST 2006 Top

SQL Server >> Creating index in a large table Thanks!




> > Hello:
> >
> > We have a table with millions of records. Not long ago, we added a
> > new column to this large table, and we like to populate this new
> > column with data and create an index for it.
> >
> > Would there be any time difference if we create an index
> > (non-clustered) before we populate it with data, or after we populate
> > it with data? In another word, is it true that it takes lesser time
> > to create an index in a column with only null?
> >
> > Thanks,
> >
> > Q
>
> Definitely update before the index. The update process is going to cause
> excessive logging because of the number of rows - which means you should
> update in batches (say 100,000 rows at a time if possible). Each update will
> require a corresponding index update if th eindex is in place, causing even
> more work. Create the index after the column is loaded with data for the
> rows in question.
>
> --
> David Gugick
> Quest Software
>
>
>
 
 
Q





PostPosted: Wed Jan 11 14:41:03 CST 2006 Top

SQL Server >> Creating index in a large table Thanks!



> Create the index *after* updating the column with data. Creating the
> index first will add overhead during the update process.
>


>
> >Hello:
> >
> >We have a table with millions of records. Not long ago, we added a new
> >column to this large table, and we like to populate this new column with data
> >and create an index for it.
> >
> >Would there be any time difference if we create an index (non-clustered)
> >before we populate it with data, or after we populate it with data? In
> >another word, is it true that it takes lesser time to create an index in a
> >column with only null?
> >
> >Thanks,
> >
> >Q
>