Sub: SQL Server Query cost serial number  
Author Message
broogle





PostPosted: Wed Sep 26 05:16:01 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number

We have a table with around 800,000 rows in our SQL Server 2005
database. Its got a column which
stores a char(6) serial number representing the product identification
number. We require to generate reports such
that we retrieve a certian range of serial numbers.
But out of the 6 characters 3 represent the product code,2 represent
the vendor code,1 represent
the color code. So we need to generate a query such that its

SELECT <ColumnNames>
FROM Parts_t
WHERE SUBSTRING(SerNum,1,3) BETWEEN '500' AND '599'
AND SUBSTRING(SerNum,4,2) BETWEEN '00' AND '99'
AND SUBSTRING(SerNum,6,1) BETWEEN '1' AND '1'

Whats the best way to implement this? Please suggest the indexes too.
Tried clustered index on
'SerNum' but our impression is once the 'SUBSTRING' function is used
on the column, the index
would not serve the purpose.

We also tried creating an indexed view with the parsed serial numbers
created and non-clustered
index created for each of the new columns 'Ser1','Ser2',Ser3'. Though
in the actual execution plan
the comparative cost of indexed view was only 25% with respect to the
query using the 'SUBSTRING'
from the actual table, the duration taken in the profiler was
contradictory(145 sec for indexed view
and 60 seconds for actual table). Any thoughts on this behaviour?

Thanks, in advance for any suggestions.


Thanks,
MPL

SQL Server274  
 
 
sachane





PostPosted: Wed Sep 26 05:16:01 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number
Dear sir,

I have the sql quearies

pls

reply








> We have a table with around 800,000 rows in our SQL Server 2005
> database. Its got a column which
> stores a char(6) serial number representing the product identification
> number. We require to generate reports such
> that we retrieve a certian range of serial numbers.
> But out of the 6 characters 3 represent the product code,2 represent
> the vendor code,1 represent
> the color code. So we need to generate a query such that its
>
> SELECT <ColumnNames>
> FROM Parts_t
> WHERE SUBSTRING(SerNum,1,3) BETWEEN '500' AND '599'
> AND SUBSTRING(SerNum,4,2) BETWEEN '00' AND '99'
> AND SUBSTRING(SerNum,6,1) BETWEEN '1' AND '1'
>
> Whats the best way to implement this? Please suggest the indexes too.
> Tried clustered index on
> 'SerNum' but our impression is once the 'SUBSTRING' function is used
> on the column, the index
> would not serve the purpose.
>
> We also tried creating an indexed view with the parsed serial numbers
> created and non-clustered
> index created for each of the new columns 'Ser1','Ser2',Ser3'. Though
> in the actual execution plan
> the comparative cost of indexed view was only 25% with respect to the
> query using the 'SUBSTRING'
> from the actual table, the duration taken in the profiler was
> contradictory(145 sec for indexed view
> and 60 seconds for actual table). Any thoughts on this behaviour?
>
> Thanks, in advance for any suggestions.
>
> Thanks,
> MPL


 
 
TheSQLGuru





PostPosted: Wed Sep 26 07:37:16 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number 1) Did you actually TEST to see if the index is used when you substring the
column? Pick a singular value for the first 3 characters and see if you get
an index seek. I have seen cases where sql server 2005 will use an index
even though a function is performed on the indexed column in the SARG.

2) If your range (500-599 for example) hits a sufficiently large number of
rows (between 10 and 15% usually) the optimizer will choose a table scan
since sequential I/O is so much more efficient than index seeks/bookmark
lookups.

3) You can also try computed columns that you then index. Again, choosing
values for a column that doesn't really limit the rows very much
(non-specific filters) mean indexes are not really useful.




> We have a table with around 800,000 rows in our SQL Server 2005
> database. Its got a column which
> stores a char(6) serial number representing the product identification
> number. We require to generate reports such
> that we retrieve a certian range of serial numbers.
> But out of the 6 characters 3 represent the product code,2 represent
> the vendor code,1 represent
> the color code. So we need to generate a query such that its
>
> SELECT <ColumnNames>
> FROM Parts_t
> WHERE SUBSTRING(SerNum,1,3) BETWEEN '500' AND '599'
> AND SUBSTRING(SerNum,4,2) BETWEEN '00' AND '99'
> AND SUBSTRING(SerNum,6,1) BETWEEN '1' AND '1'
>
> Whats the best way to implement this? Please suggest the indexes too.
> Tried clustered index on
> 'SerNum' but our impression is once the 'SUBSTRING' function is used
> on the column, the index
> would not serve the purpose.
>
> We also tried creating an indexed view with the parsed serial numbers
> created and non-clustered
> index created for each of the new columns 'Ser1','Ser2',Ser3'. Though
> in the actual execution plan
> the comparative cost of indexed view was only 25% with respect to the
> query using the 'SUBSTRING'
> from the actual table, the duration taken in the profiler was
> contradictory(145 sec for indexed view
> and 60 seconds for actual table). Any thoughts on this behaviour?
>
> Thanks, in advance for any suggestions.
>
>
> Thanks,
> MPL
>


 
 
mplpost





PostPosted: Thu Sep 27 00:41:20 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number Hi,
Thanks a lot, for sparing the time to respond.

Sachane: I think i posted the query - Its

SELECT <ColumnNames>
FROM Parts_t
WHERE SUBSTRING(SerNum,1,3) BETWEEN '500' AND '599'
AND SUBSTRING(SerNum,4,2) BETWEEN '00' AND '99'
AND SUBSTRING(SerNum,6,1) BETWEEN '1' AND '1'


SQLGuru:
a)The execution plan is showing an Index scan on the index
created on column 'SerNum'. Are Index scans any better than table
scans? When we changed the 'SerNum' to a clustered index, the cost of
the table with the clustered index is higher than the table with non-
clustered index . Any thoughts on this?

c)Will try computed columns. But in our case, we are catering
to multiple clients. So each client will have different 'serial
number' identification substrings, that is, client1 will have the
first 3 characters as the vendor code, while client2 will have only
first 2 characters? Can we have a UDF as a computed column to
calculate and keep this as a new column?


Thanks,
MPL



> 1) Did you actually TEST to see if the index is used when you substring the
> column? Pick a singular value for the first 3 characters and see if you get
> an index seek. I have seen cases wheresqlserver2005 will use an index
> even though a function is performed on the indexed column in the SARG.
>
> 2) If your range (500-599 for example) hits a sufficiently large number of
> rows (between 10 and 15% usually) the optimizer will choose a table scan
> since sequential I/O is so much more efficient than index seeks/bookmark
> lookups.
>
> 3) You can also try computed columns that you then index. Again, choosing
> values for a column that doesn't really limit the rows very much
> (non-specific filters) mean indexes are not really useful.
>

>

>
> > We have a table with around 800,000 rows in ourSQLServer2005
> > database. Its got a column which
> > stores a char(6) serial number representing the product identification
> > number. We require to generate reports such
> > that we retrieve a certian range of serial numbers.
> > But out of the 6 characters 3 represent the product code,2 represent
> > the vendor code,1 represent
> > the color code. So we need to generate aquerysuch that its
>
> > SELECT <ColumnNames>
> > FROM Parts_t
> > WHERE SUBSTRING(SerNum,1,3) BETWEEN '500' AND '599'
> > AND SUBSTRING(SerNum,4,2) BETWEEN '00' AND '99'
> > AND SUBSTRING(SerNum,6,1) BETWEEN '1' AND '1'
>
> > Whats the best way to implement this? Please suggest the indexes too.
> > Tried clustered index on
> > 'SerNum' but our impression is once the 'SUBSTRING' function is used
> > on the column, the index
> > would not serve the purpose.
>
> > We also tried creating an indexed view with the parsed serial numbers
> > created and non-clustered
> > index created for each of the new columns 'Ser1','Ser2',Ser3'. Though
> > in the actual execution plan
> > the comparativecostof indexed view was only 25% with respect to the
> >queryusing the 'SUBSTRING'
> > from the actual table, the duration taken in the profiler was
> > contradictory(145 sec for indexed view
> > and 60 seconds for actual table). Any thoughts on this behaviour?
>
> > Thanks, in advance for any suggestions.
>
> > Thanks,
> > MPL


 
 
mplpost





PostPosted: Thu Sep 27 00:57:25 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number Hi,
Can some tuning experts shed light on this specific question.

I did create the copy of the table mentioned above and configured
different indexes on the 2nd one. When i ran the query on both the
tables and checked comparative query cost in the execution plan, the
query on table1 is 25% while table2 is 75%. But when i check the
duration in the profiler its just the opposite, the duration on table1
is 100seconds and table2 is 40seconds. My understanding was the cost
would tell it all, and a lower cost meant a faster query. Please shed
light on this mystery or am I getting the whole concept wrong.

Thanks,
MPL

 
 
TheSQLGuru





PostPosted: Thu Sep 27 07:18:54 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number Percentages are not quite as important as the magnitude of the actual
numbers in this case.

An index scan on a clustered index is better than a table scan, assuming the
scan isn't scanning the entire table. A clustered index IS the table at the
data level.

nonclustered indexes will result in a bookmark lookup, which can be very
inefficient.

You cannot use a function to split the column as needed for different
clients (3 for one, 2 for another) produce a useful index on that computed
column. You would need multiple computed columns/indexes in this case.



> Hi,
> Can some tuning experts shed light on this specific question.
>
> I did create the copy of the table mentioned above and configured
> different indexes on the 2nd one. When i ran the query on both the
> tables and checked comparative query cost in the execution plan, the
> query on table1 is 25% while table2 is 75%. But when i check the
> duration in the profiler its just the opposite, the duration on table1
> is 100seconds and table2 is 40seconds. My understanding was the cost
> would tell it all, and a lower cost meant a faster query. Please shed
> light on this mystery or am I getting the whole concept wrong.
>
> Thanks,
> MPL
>


 
 
mplpost





PostPosted: Thu Sep 27 20:59:25 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number Hi,
SQLGuru, thanks for responding. Did u mean, regarding the relative
query costs, if we are checking only the overall query time taken
there is no relevance to the "comparative % cost" specified in the
execution plans. That is I have to always check only the duration in
the profiler, and never check the %cost in the execution plan, when I
am just concerned about the query output time.

Is Tibor Karaszi out there, love to have some comments from him
too:)

Thanks,
MPL

 
 
Tibor





PostPosted: Fri Sep 28 00:48:11 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number Check out http://blogs.msdn.com/sqlqueryprocessing/archive/2006/10/11/What-is-this-cost.aspx
regarding the cost numbers...

--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi



> Hi,
> SQLGuru, thanks for responding. Did u mean, regarding the relative
> query costs, if we are checking only the overall query time taken
> there is no relevance to the "comparative % cost" specified in the
> execution plans. That is I have to always check only the duration in
> the profiler, and never check the %cost in the execution plan, when I
> am just concerned about the query output time.
>
> Is Tibor Karaszi out there, love to have some comments from him
> too:)
>
> Thanks,
> MPL
>

 
 
Gert-Jan





PostPosted: Fri Sep 28 12:33:13 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number MPL,

In this case, the optimal configuration depends on a few things:
1. The total number of rows and table size, and the number of rows
selected by the query
2. The selectivity of each of the 3 parts of the key

If you select relatively many rows, then a nonclustered index is of no
use. If you select relatively few rows, then the proper nonclustered
index might be your best bet.

Your queries shows filtering on three ranges. Because of that, a
clustered compound index is of limited use because the second and third
range can only be scanned in such an index. If a nonclustered index is
used, and the second and third range eliminate a significant number of
rows, then a compound index makes sense. If not, then a single column
index is better.

It is important to know which range is the most selective. If the filter
on the first 3 characters is the most selective, then it is easy, you
index the column and use SerNum BETWEEN '500' AND '599' + 'Z' in the
WHERE clause.

If one of the two other filters is more selective, then you could
consider creating a computed column and indexing that. Of course, the
computed column scenario would require selecting from the computed
column. Since you are only filtering on 2 characters it is highly
unlikely that filtering on 2 characters is very selective.

When testing, you need to determine if you want to test with a cold
cache or hot cache. Whichever you choose, you have to make sure that you
test each scenario exactly the same. So either clear the cache before
each try, or prime the cache before each run (for example by running the
query twice). For most acurate results you must use the actual
performance stats (use SELECT GETDATE() or use SQL Profiler). The query
plan will only show estimates of performance.


If you provide the data in points [1] and [2] maybe I can be more
specific.

HTH
--
Gert-Jan



>
> Hi,
> Can some tuning experts shed light on this specific question.
>
> I did create the copy of the table mentioned above and configured
> different indexes on the 2nd one. When i ran the query on both the
> tables and checked comparative query cost in the execution plan, the
> query on table1 is 25% while table2 is 75%. But when i check the
> duration in the profiler its just the opposite, the duration on table1
> is 100seconds and table2 is 40seconds. My understanding was the cost
> would tell it all, and a lower cost meant a faster query. Please shed
> light on this mystery or am I getting the whole concept wrong.
>
> Thanks,
> MPL
 
 
mplpost





PostPosted: Sun Sep 30 22:59:22 PDT 2007 Top

SQL Server >> Sub: SQL Server Query cost serial number Hi,

Thanks a lot Gert and to all others who have responded. Hope to get
more responses.

We would like to make one point clear that always my first(none of
them) SARG would not be necessarily be '500' to '599' it could be
'523' to '561' or '531' to '531' , its user defined. So we cannot
generalize it as a like '5%%'.

We are posting these messages only in this thread, but we found the
sames postings in http://www.developersdex.com/sql/message.asp?p=580&r=5965267
too.
So we missed out the reply from Tibor in here. Tibor, if its not much
of a problem could u post the replies in here only so we dont miss out
on the valuable replies u post.



Thanks,
MPL.