Trying to remove multiple spaces in a string |
|
Author |
Message |
acarr_fss
|
Posted: Fri Jan 30 15:07:39 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
I have some values that I need to convert a varring number of spaces
between words and letters to one space. Is there a way to do this other
than running an update similar to the following until no more rows are
updated?
Update tblX
Set colX = replace(colX, ' ', ' ')
Thanks much
Carl
*** Sent via Developersdex http://www.hide-link.com/ ***
Don't just participate in USENET...get rewarded for it!
SQL Server107
|
|
|
|
|
Louis
|
Posted: Fri Jan 30 15:07:39 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
In this case you might want to build a UDF like:
create function string$singleSpaceReplace
(
@inputStr varchar(8000)
)
returns varchar(8000)
as
begin
--look for double space, replace with single space
end
go
Returns:
Hi this string is out of control
This will be perfece in the SET or SELECT clause of a query. There might be
a more efficient way to do it, but I figured that just replacing every
double string with a single would give at least good enough results,
depending on how many spaces you are dealing with. Of course, best practice
would be to apply a check constraint on the table after you have cleaned up
the data and you won't have to go here ever again:
create table tableX
(
colX varchar(100)
)
alter table tableX
add constraint colX_singleSpaced check (charindex(' ',colX) = 0)
Now, try to insert the funky string:
insert into tableX
values ('Hi this string is out of control')
Server: Msg 547, Level 16, State 1, Line 1
INSERT statement conflicted with COLUMN CHECK constraint
'colX_singleSpaced'. The conflict occurred in database 'tempdb', table
'tableX', column 'colX'.
The statement has been terminated.
insert into tableX
values ('Hi this string is out of control')
Works fine. Now you only have to think of this once (if this fits your
requirements of course) and it forces the UI developers to only put good
data into the tables.
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> I have some values that I need to convert a varring number of spaces
> between words and letters to one space. Is there a way to do this other
> than running an update similar to the following until no more rows are
> updated?
>
> Update tblX
> Set colX = replace(colX, ' ', ' ')
>
> Thanks much
> Carl
>
> *** Sent via Developersdex http://www.developersdex.com ***
> Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
James
|
Posted: Fri Jan 30 15:14:09 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Carl,
Just doing it once will be enough.
Thought: 3 spaces in a row, this is merely 3 occurrences of a single space,
and each will be 'replaced' with nothing, thereby removing all 3 in one
swoop.
Go ahead and run your update as is.
One time is all it will take.
James Hokes
> I have some values that I need to convert a varring number of spaces
> between words and letters to one space. Is there a way to do this other
> than running an update similar to the following until no more rows are
> updated?
>
> Update tblX
> Set colX = replace(colX, ' ', ' ')
>
> Thanks much
> Carl
>
> *** Sent via Developersdex http://www.developersdex.com ***
> Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Louis
|
Posted: Fri Jan 30 15:52:53 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
He wanted to replace 2-N spaces with a single space, not get rid of all
spaces.
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> Carl,
>
> Just doing it once will be enough.
>
> Thought: 3 spaces in a row, this is merely 3 occurrences of a single
space,
> and each will be 'replaced' with nothing, thereby removing all 3 in one
> swoop.
>
> Go ahead and run your update as is.
> One time is all it will take.
>
> James Hokes
>
> > I have some values that I need to convert a varring number of spaces
> > between words and letters to one space. Is there a way to do this other
> > than running an update similar to the following until no more rows are
> > updated?
> >
> > Update tblX
> > Set colX = replace(colX, ' ', ' ')
> >
> > Thanks much
> > Carl
> >
> > *** Sent via Developersdex http://www.developersdex.com ***
> > Don't just participate in USENET...get rewarded for it!
>
>
|
|
|
|
|
James
|
Posted: Fri Jan 30 17:19:09 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Louis,
> He wanted to replace 2-N spaces with a single space, not get rid of all
> spaces.
Nice catch.
James Hokes
|
|
|
|
|
Joe
|
Posted: Sat Jan 31 18:00:18 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
>> need to convert a varying number of spaces between words and letters
to one space. <<
You can nest function calls up to 32 levels deep, so just expand what
you already have done:
UPDATE Foobar
SET col_x
= REPLACE (
REPLACE (
REPLACE (
..
REPLACE(col_x, ' ', ' ')
..
' ', ' ')
' ', ' ')
' ', ' ');
This is a LOT faster than**** in a loop and it is pure SQL, not
proprietary, procedural code.
Matb problem for you: let col_x be VARCHAR(n). What is the optimal mix
of replacement strings for reducing the number of spaces?
Let (j->k) mean "replaces (j) spaces with (k) spaces"
a) (2->1) for log2(n) ?
b) Is it best to always have (k->1)?
c) (floor(sqrt(n)) -> 1) as a starter?
I am in a weird mood this afternoon.
--CELKO--
===========================
Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in your
schema are.
*** Sent via Developersdex http://www.hide-link.com/ ***
Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Louis
|
Posted: Sat Jan 31 19:28:31 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Yes it is a bit faster (I tested :) but not all that much more. Things will
get interesting when we can build our own functions in .NET.
One thing though, there is not a 32 level nest limit on built-in functions.
The following example supports 100 replaces:
select
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(repl
ace(
'Remove the spaces'
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
,' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),' ',' '),'
',' '),' ',' '),' ',' ')
Thanks
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.hide-link.com/
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> >> need to convert a varying number of spaces between words and letters
> to one space. <<
>
> You can nest function calls up to 32 levels deep, so just expand what
> you already have done:
>
> UPDATE Foobar
> SET col_x
> = REPLACE (
> REPLACE (
> REPLACE (
> ..
> REPLACE(col_x, ' ', ' ')
> ..
> ' ', ' ')
> ' ', ' ')
> ' ', ' ');
>
> This is a LOT faster than**** in a loop and it is pure SQL, not
> proprietary, procedural code.
>
> Matb problem for you: let col_x be VARCHAR(n). What is the optimal mix
> of replacement strings for reducing the number of spaces?
>
> Let (j->k) mean "replaces (j) spaces with (k) spaces"
>
> a) (2->1) for log2(n) ?
> b) Is it best to always have (k->1)?
> c) (floor(sqrt(n)) -> 1) as a starter?
>
> I am in a weird mood this afternoon.
>
> --CELKO--
> ===========================
> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, datatypes, etc. in your
> schema are.
>
> *** Sent via Developersdex http://www.hide-link.com/ ***
> Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
lindawie
|
Posted: Sat Jan 31 19:23:41 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Joe,
> You can nest function calls up to 32 levels deep, so just expand
> what you already have done:
RTFM. That's not correct. The maximum nesting level of 32 applies to
stored procedures. It does not apply to builtin functions such as
replace().
http://google.com/groups?selm=OiukEDWKDHA.1372%40TK2MSFTNGP12.phx.gbl
This is something you should know if you are using SQL Server for
teaching.
Linda
|
|
|
|
|
Gert-Jan
|
Posted: Sun Feb 01 16:39:43 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
The statement below will take care of all excess spaces in a varchar
column:
update tblX
set col1 =
Replace(
Replace(
Replace(
Replace(
Replace(
Replace(colX, replicate(' ',132),' ')
, replicate(' ', 21),' ')
, replicate(' ', 6),' ')
, replicate(' ', 3),' ')
, replicate(' ', 2),' ')
, replicate(' ', 2),' ')
Hope this helps,
Gert-Jan
>
> I have some values that I need to convert a varring number of spaces
> between words and letters to one space. Is there a way to do this other
> than running an update similar to the following until no more rows are
> updated?
>
> Update tblX
> Set colX = replace(colX, ' ', ' ')
>
> Thanks much
> Carl
>
> *** Sent via Developersdex http://www.developersdex.com ***
> Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Louis
|
Posted: Sun Feb 01 22:54:06 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Please explain how? I am pretty sure I understand, and I don't doubt it
(well, after I tested it in the following loop :) but it is very cool.
Also, can you tell me where you got this? Like was it yours? I want to
make sure that I give credit when I mention it to others.
begin
replicate(' ',132),' ')
, replicate(' ', 21),' ')
, replicate(' ', 6),' ')
, replicate(' ', 3),' ')
, replicate(' ', 2),' ')
, replicate(' ', 2),' ') <> '* *'
begin
end
end
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> The statement below will take care of all excess spaces in a varchar
> column:
>
> update tblX
> set col1 =
> Replace(
> Replace(
> Replace(
> Replace(
> Replace(
> Replace(colX, replicate(' ',132),' ')
> , replicate(' ', 21),' ')
> , replicate(' ', 6),' ')
> , replicate(' ', 3),' ')
> , replicate(' ', 2),' ')
> , replicate(' ', 2),' ')
>
> Hope this helps,
> Gert-Jan
>
>
> >
> > I have some values that I need to convert a varring number of spaces
> > between words and letters to one space. Is there a way to do this other
> > than running an update similar to the following until no more rows are
> > updated?
> >
> > Update tblX
> > Set colX = replace(colX, ' ', ' ')
> >
> > Thanks much
> > Carl
> >
> > *** Sent via Developersdex http://www.developersdex.com ***
> > Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Steve
|
Posted: Sun Feb 01 23:53:40 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
There have been past discussions about best sequences of replacement
lengths to use. Here's one thread, which includes my favorite solution
to this question, posted by Robert Carnegie.
(http://groups.google.com/groups?q=carnegie+berlyant+doing)
Robert's solution:
If there are characters such as < and > which the string is
guaranteed not to contain,
replace(
replace(
' ' , '<>'),
'><', '' ),
'<>', ' ' )
(Think about it...) In fact, you only need one character not already
in the string, plus space...
SK
>Please explain how? I am pretty sure I understand, and I don't doubt it
>(well, after I tested it in the following loop :) but it is very cool.
>Also, can you tell me where you got this? Like was it yours? I want to
>make sure that I give credit when I mention it to others.
>
>
>
> begin
> replicate(' ',132),' ')
> , replicate(' ', 21),' ')
> , replicate(' ', 6),' ')
> , replicate(' ', 3),' ')
> , replicate(' ', 2),' ')
> , replicate(' ', 2),' ') <> '* *'
> begin
> end
>
> end
>
>
>
>
>
|
|
|
|
|
Louis
|
Posted: Mon Feb 02 10:16:16 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
That is okay, but I don't like the concept of characters that the string is
guaranteed not to have, though it is really neat. The other solution is
technically better, just by a nose.
I think this is really cool. I had never really considered how to solve it
using replaces, especially when I can build my own little loop to solve the
problem in a function. The solution that came to mind immediately was a
recursive kind of thing like the replace thing that Celko posted, but we
have that whole 32 recursive calls limit, so the loop came to mind.
The loop function is not all that slow, and I would have probably favored it
over the multiple calls to the replace (' ', ' ') function, just because it
was not guaranteed to work (what if there is one more instance than you have
replaces? Plus, if you put that call in a check constraint, the time to
execute is not even measurable using the getdate() function.
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> There have been past discussions about best sequences of replacement
> lengths to use. Here's one thread, which includes my favorite solution
> to this question, posted by Robert Carnegie.
> (http://groups.google.com/groups?q=carnegie+berlyant+doing)
>
> Robert's solution:
>
> If there are characters such as < and > which the string is
> guaranteed not to contain,
>
> replace(
> replace(
> ' ' , '<>'),
> '><', '' ),
> '<>', ' ' )
> (Think about it...) In fact, you only need one character not already
> in the string, plus space...
>
> SK
>
>
>
> >Please explain how? I am pretty sure I understand, and I don't doubt it
> >(well, after I tested it in the following loop :) but it is very cool.
> >Also, can you tell me where you got this? Like was it yours? I want to
> >make sure that I give credit when I mention it to others.
> >
> >
> >
> > begin
> > replicate(' ',132),' ')
> > , replicate(' ', 21),' ')
> > , replicate(' ', 6),' ')
> > , replicate(' ', 3),' ')
> > , replicate(' ', 2),' ')
> > , replicate(' ', 2),' ') <> '* *'
> > begin
> > end
> >
> > end
> >
> >
> >
> >
> >
>
|
|
|
|
|
John
|
Posted: Mon Feb 02 12:42:04 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
> That is okay, but I don't like the concept of characters that the string is
> guaranteed not to have, though it is really neat. The other solution is
> technically better, just by a nose.
>
> I think this is really cool. I had never really considered how to solve it
> using replaces, especially when I can build my own little loop to solve the
> problem in a function. The solution that came to mind immediately was a
> recursive kind of thing like the replace thing that Celko posted, but we
> have that whole 32 recursive calls limit, so the loop came to mind.
>
> The loop function is not all that slow, and I would have probably favored it
> over the multiple calls to the replace (' ', ' ') function, just because it
> was not guaranteed to work (what if there is one more instance than you have
> replaces? Plus, if you put that call in a check constraint, the time to
> execute is not even measurable using the getdate() function.
If using nested calls to REPLACE, where each function call replaces
' ' (2 spaces) with ' ' (1 space), then N such nested calls can replace
2^N spaces with 1. Since a VARCHAR value's maximum length is 8000
characters, which is just under 2^13, a mere 13 calls will handle
the maximum number of successive spaces. For VARCHAR(N),
the number of nested REPLACE calls needed is therefore
CEILING(LOG2(N)). It also puts back into the realm of possibility
recursive solutions as 32 levels is more than enough. Ya gotta
love divide-and-conquer.
Regards,
jag
> --
> ----------------------------------------------------------------------------
> -----------
> Compass Technology Management
>
> Pro SQL Server 2000 Database Design
> http://www.apress.com/book/bookDisplay.html?bID=266
>
> Note: Please reply to the newsgroups only unless you are
> interested in consulting services. All other replies will be ignored :)
>
> > There have been past discussions about best sequences of replacement
> > lengths to use. Here's one thread, which includes my favorite solution
> > to this question, posted by Robert Carnegie.
> > (http://groups.google.com/groups?q=carnegie+berlyant+doing)
> >
> > Robert's solution:
> >
> > If there are characters such as < and > which the string is
> > guaranteed not to contain,
> >
> > replace(
> > replace(
> > ' ' , '<>'),
> > '><', '' ),
> > '<>', ' ' )
> > (Think about it...) In fact, you only need one character not already
> > in the string, plus space...
> >
> > SK
> >
> >
> >
> > >Please explain how? I am pretty sure I understand, and I don't doubt it
> > >(well, after I tested it in the following loop :) but it is very cool.
> > >Also, can you tell me where you got this? Like was it yours? I want to
> > >make sure that I give credit when I mention it to others.
> > >
> > >
> > >
> > > begin
> > > replicate(' ',132),' ')
> > > , replicate(' ', 21),' ')
> > > , replicate(' ', 6),' ')
> > > , replicate(' ', 3),' ')
> > > , replicate(' ', 2),' ')
> > > , replicate(' ', 2),' ') <> '* *'
> > > begin
> > > end
> > >
> > > end
> > >
> > >
> > >
> > >
> > >
> >
>
>
|
|
|
|
|
Louis
|
Posted: Mon Feb 02 13:54:19 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Very nice. This is the kind of thing that I wish I hadn't slept through
when I was in college. But I did. Luckily enough got through thta I
understand what you are saying :)
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> > That is okay, but I don't like the concept of characters that the string
is
> > guaranteed not to have, though it is really neat. The other solution
is
> > technically better, just by a nose.
> >
> > I think this is really cool. I had never really considered how to solve
it
> > using replaces, especially when I can build my own little loop to solve
the
> > problem in a function. The solution that came to mind immediately was a
> > recursive kind of thing like the replace thing that Celko posted, but we
> > have that whole 32 recursive calls limit, so the loop came to mind.
> >
> > The loop function is not all that slow, and I would have probably
favored it
> > over the multiple calls to the replace (' ', ' ') function, just
because it
> > was not guaranteed to work (what if there is one more instance than you
have
> > replaces? Plus, if you put that call in a check constraint, the time to
> > execute is not even measurable using the getdate() function.
>
> If using nested calls to REPLACE, where each function call replaces
> ' ' (2 spaces) with ' ' (1 space), then N such nested calls can replace
> 2^N spaces with 1. Since a VARCHAR value's maximum length is 8000
> characters, which is just under 2^13, a mere 13 calls will handle
> the maximum number of successive spaces. For VARCHAR(N),
> the number of nested REPLACE calls needed is therefore
> CEILING(LOG2(N)). It also puts back into the realm of possibility
> recursive solutions as 32 levels is more than enough. Ya gotta
> love divide-and-conquer.
>
> Regards,
> jag
>
> > --
>
> --------------------------------------------------------------------------
--
> > -----------
> > Compass Technology Management
> >
> > Pro SQL Server 2000 Database Design
> > http://www.apress.com/book/bookDisplay.html?bID=266
> >
> > Note: Please reply to the newsgroups only unless you are
> > interested in consulting services. All other replies will be ignored :)
> >
> > > There have been past discussions about best sequences of replacement
> > > lengths to use. Here's one thread, which includes my favorite
solution
> > > to this question, posted by Robert Carnegie.
> > > (http://groups.google.com/groups?q=carnegie+berlyant+doing)
> > >
> > > Robert's solution:
> > >
> > > If there are characters such as < and > which the string is
> > > guaranteed not to contain,
> > >
> > > replace(
> > > replace(
> > > ' ' , '<>'),
> > > '><', '' ),
> > > '<>', ' ' )
> > > (Think about it...) In fact, you only need one character not
already
> > > in the string, plus space...
> > >
> > > SK
> > >
> > >
> > >
> > > >Please explain how? I am pretty sure I understand, and I don't doubt
it
> > > >(well, after I tested it in the following loop :) but it is very
cool.
> > > >Also, can you tell me where you got this? Like was it yours? I want
to
> > > >make sure that I give credit when I mention it to others.
> > > >
> > > >
> > > >
> > > > begin
> > > > replicate(' ',132),' ')
> > > > , replicate(' ', 21),' ')
> > > > , replicate(' ', 6),' ')
> > > > , replicate(' ', 3),' ')
> > > > , replicate(' ', 2),' ')
> > > > , replicate(' ', 2),' ') <> '* *'
> > > > begin
> > > > end
> > > >
> > > > end
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
>
>
|
|
|
|
|
Gert-Jan
|
Posted: Mon Feb 02 14:16:02 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
As Steve noted, this is topic has been discussed before. For me it is
almost a FAQ. During one of the first discussions I thought this up. I
wouldn't be surprised if other people thought of it in different
situations.
The fun is, that many people come up with theories about how many
replaces one would need. If I understood correctly, John Gilson thinks
in this thread it needs 13. In an earlier thread (months ago) someone
had the theory that led to 7 replaces. But as you can see, you only need
6.
The idea is based on the fact that rest spaces are replaced in a later
run. You first replace the biggest block, then replace the smaller
blocks and mob up the resulting spaces from earlier replaces. When a
varchar with up to 7998 consequtive spaces is pulled through a
replace(space(132),space(1)), this can result in (at most) 59+131 = 180
spaces. Pulled through replace(space(21),space(1)), this can result in
(at most) 8+20 = 28 spaces. Etc. etc. The last replace of 2 spaces is to
mob up the final spaces.
The sequence below is one of many sequences that will do the job, and
was developed outside in. In other words, if your varchar is smaller
than 132, you can drop the innermost replace.
Gert-Jan
>
> Please explain how? I am pretty sure I understand, and I don't doubt it
> (well, after I tested it in the following loop :) but it is very cool.
> Also, can you tell me where you got this? Like was it yours? I want to
> make sure that I give credit when I mention it to others.
>
>
>
> begin
> replicate(' ',132),' ')
> , replicate(' ', 21),' ')
> , replicate(' ', 6),' ')
> , replicate(' ', 3),' ')
> , replicate(' ', 2),' ')
> , replicate(' ', 2),' ') <> '* *'
> begin
> end
>
> end
>
> --
> ----------------------------------------------------------------------------
> -----------
> Compass Technology Management
>
> Pro SQL Server 2000 Database Design
> http://www.apress.com/book/bookDisplay.html?bID=266
>
> Note: Please reply to the newsgroups only unless you are
> interested in consulting services. All other replies will be ignored :)
>
> > The statement below will take care of all excess spaces in a varchar
> > column:
> >
> > update tblX
> > set col1 =
> > Replace(
> > Replace(
> > Replace(
> > Replace(
> > Replace(
> > Replace(colX, replicate(' ',132),' ')
> > , replicate(' ', 21),' ')
> > , replicate(' ', 6),' ')
> > , replicate(' ', 3),' ')
> > , replicate(' ', 2),' ')
> > , replicate(' ', 2),' ')
> >
> > Hope this helps,
> > Gert-Jan
> >
> >
> > >
> > > I have some values that I need to convert a varring number of spaces
> > > between words and letters to one space. Is there a way to do this other
> > > than running an update similar to the following until no more rows are
> > > updated?
> > >
> > > Update tblX
> > > Set colX = replace(colX, ' ', ' ')
> > >
> > > Thanks much
> > > Carl
> > >
> > > *** Sent via Developersdex http://www.developersdex.com ***
> > > Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Louis
|
Posted: Mon Feb 02 15:09:43 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
If there was a prize, and let's be real, there is none, you just won it :)
--
----------------------------------------------------------------------------
-----------
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
> As Steve noted, this is topic has been discussed before. For me it is
> almost a FAQ. During one of the first discussions I thought this up. I
> wouldn't be surprised if other people thought of it in different
> situations.
>
> The fun is, that many people come up with theories about how many
> replaces one would need. If I understood correctly, John Gilson thinks
> in this thread it needs 13. In an earlier thread (months ago) someone
> had the theory that led to 7 replaces. But as you can see, you only need
> 6.
>
> The idea is based on the fact that rest spaces are replaced in a later
> run. You first replace the biggest block, then replace the smaller
> blocks and mob up the resulting spaces from earlier replaces. When a
> varchar with up to 7998 consequtive spaces is pulled through a
> replace(space(132),space(1)), this can result in (at most) 59+131 = 180
> spaces. Pulled through replace(space(21),space(1)), this can result in
> (at most) 8+20 = 28 spaces. Etc. etc. The last replace of 2 spaces is to
> mob up the final spaces.
>
> The sequence below is one of many sequences that will do the job, and
> was developed outside in. In other words, if your varchar is smaller
> than 132, you can drop the innermost replace.
>
> Gert-Jan
>
> >
> > Please explain how? I am pretty sure I understand, and I don't doubt it
> > (well, after I tested it in the following loop :) but it is very cool.
> > Also, can you tell me where you got this? Like was it yours? I want to
> > make sure that I give credit when I mention it to others.
> >
> >
> >
> > begin
> > replicate(' ',132),' ')
> > , replicate(' ', 21),' ')
> > , replicate(' ', 6),' ')
> > , replicate(' ', 3),' ')
> > , replicate(' ', 2),' ')
> > , replicate(' ', 2),' ') <> '* *'
> > begin
> > end
> >
> > end
> >
> > --
>
> --------------------------------------------------------------------------
--
> > -----------
> > Compass Technology Management
> >
> > Pro SQL Server 2000 Database Design
> > http://www.apress.com/book/bookDisplay.html?bID=266
> >
> > Note: Please reply to the newsgroups only unless you are
> > interested in consulting services. All other replies will be ignored :)
> >
> > > The statement below will take care of all excess spaces in a varchar
> > > column:
> > >
> > > update tblX
> > > set col1 =
> > > Replace(
> > > Replace(
> > > Replace(
> > > Replace(
> > > Replace(
> > > Replace(colX, replicate(' ',132),' ')
> > > , replicate(' ', 21),' ')
> > > , replicate(' ', 6),' ')
> > > , replicate(' ', 3),' ')
> > > , replicate(' ', 2),' ')
> > > , replicate(' ', 2),' ')
> > >
> > > Hope this helps,
> > > Gert-Jan
> > >
> > >
> > > >
> > > > I have some values that I need to convert a varring number of spaces
> > > > between words and letters to one space. Is there a way to do this
other
> > > > than running an update similar to the following until no more rows
are
> > > > updated?
> > > >
> > > > Update tblX
> > > > Set colX = replace(colX, ' ', ' ')
> > > >
> > > > Thanks much
> > > > Carl
> > > >
> > > > *** Sent via Developersdex http://www.developersdex.com ***
> > > > Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Steve
|
Posted: Mon Feb 02 16:12:18 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
I don't know what algorithm SQL Server uses for REPLACE, but how about
optimizing the CPU time among sequences of 6 lengths that work...
Here's a script to play with, and a sequence of short numbers that works:
i int unique,
test as '['+space(i)+']'
)
select P3.i*5930+P1.ProductID*77-77+P2.ProductID-1
from
Northwind..Products P1,
Northwind..Products P2,
(select 0 i union all select 1) P3
where P3.i*5930+P1.ProductID*77-77+P2.ProductID-1 <= 7998
select top 1 with ties i, L from (
select
i,len(replace(replace(replace(replace(replace(replace(test,
where L >= 2
order by 2,1
go
SK
>If there was a prize, and let's be real, there is none, you just won it :)
>
>
>
|
|
|
|
|
Delbert
|
Posted: Fri Feb 06 12:08:38 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
Here are (the?) 24 solutions
with fewer then 7 stages
and lengths less than 20.
Solution TotalLength
-------------------- -----------
[19, 17, 6, 3, 2, 2] 49
[19, 17, 6, 3, 3, 2] 50
[19, 17, 6, 4, 2, 2] 50
[19, 17, 6, 4, 3, 2] 51
[19, 17, 7, 3, 2, 2] 50
[19, 17, 7, 3, 3, 2] 51
[19, 17, 7, 4, 2, 2] 51
[19, 17, 7, 4, 3, 2] 52
[19, 18, 6, 3, 2, 2] 50
[19, 18, 6, 3, 3, 2] 51
[19, 18, 6, 4, 2, 2] 51
[19, 18, 6, 4, 3, 2] 52
[19, 18, 7, 3, 2, 2] 51
[19, 18, 7, 3, 3, 2] 52
[19, 18, 7, 4, 2, 2] 52
[19, 18, 7, 4, 3, 2] 53
[19, 19, 6, 3, 2, 2] 51
[19, 19, 6, 3, 3, 2] 52
[19, 19, 6, 4, 2, 2] 52
[19, 19, 6, 4, 3, 2] 53
[19, 19, 7, 3, 2, 2] 52
[19, 19, 7, 3, 3, 2] 53
[19, 19, 7, 4, 2, 2] 53
[19, 19, 7, 4, 3, 2] 54
Bye,
Delbert Glass
|
|
|
|
|
John
|
Posted: Fri Feb 13 01:13:52 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
> As Steve noted, this is topic has been discussed before. For me it is
> almost a FAQ. During one of the first discussions I thought this up. I
> wouldn't be surprised if other people thought of it in different
> situations.
>
> The fun is, that many people come up with theories about how many
> replaces one would need. If I understood correctly, John Gilson thinks
> in this thread it needs 13. In an earlier thread (months ago) someone
> had the theory that led to 7 replaces. But as you can see, you only need
> 6.
Just to be clear, my statement was not a lower bound, but an upper bound,
on how many nested REPLACE calls of 2-to-1 space replacement are
needed for a given size VARCHAR. That is, more than CEILING(LOG2(N))
nested REPLACE calls of 2-to-1 space replacement are never needed
to reduce a VARCHAR(N). Since 2^13 is 8192, a maximum-length
VARCHAR of 8000 characters will require no more than 13 successive
2-to-1 space replacements. This is useful to know if an efficient and correct
answer is needed and a more clever solution is elusive. Clearly, the lower
bound is less than this when replacing more than two spaces at a time.
Let's explore this lower bound. Here's a stored procedure that will
return all sequences of integers that will handle a VARCHAR of a
given length. The procedure can be called to find all sequences,
regardless of length, or simply the shortest sequences.
CREATE TABLE DivisorSequences
(
divisor_sequence VARCHAR(100) NOT NULL,
number_of_divisors INT NOT NULL CHECK (number_of_divisors >= 1),
first_divisor INT NOT NULL CHECK (first_divisor >= 2),
max_dividend INT NOT NULL CHECK (max_dividend >= 2),
PRIMARY KEY (number_of_divisors, divisor_sequence)
)
CREATE VIEW Digits (d)
AS
SELECT 0
UNION ALL
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
UNION ALL
SELECT 4
UNION ALL
SELECT 5
UNION ALL
SELECT 6
UNION ALL
SELECT 7
UNION ALL
SELECT 8
UNION ALL
SELECT 9
CREATE TABLE NonnegativeIntegers
(
n INT NOT NULL PRIMARY KEY CHECK (n >= 0)
)
INSERT INTO NonnegativeIntegers (n)
SELECT Ones.d + 10 * Tens.d + 100 * Hundreds.d + 1000 * Thousands.d +
10000 * TenThousands.d
FROM Digits AS Ones
CROSS JOIN
Digits AS Tens
CROSS JOIN
Digits AS Hundreds
CROSS JOIN
Digits AS Thousands
CROSS JOIN
Digits AS TenThousands
CREATE PROCEDURE InsertDivisorSequences
@target_dividend INT,
@shortest_only CHAR(1) = 'Y' -- 'N' for all sequences regardless of length
AS
BEGIN -- IF
INSERT INTO DivisorSequences
(divisor_sequence, number_of_divisors, first_divisor, max_dividend)
BEGIN -- WHILE
BEGIN -- IF
IF EXISTS (SELECT *
FROM DivisorSequences
BREAK
ELSE
INSERT INTO DivisorSequences
(divisor_sequence, number_of_divisors, first_divisor, max_dividend)
SELECT CAST(N.n AS VARCHAR) + '.' + divisor_sequence,
D.number_of_divisors + 1,
N.n,
(D.max_dividend - N.n + 2) * N.n + N.n - 2
-- unsimplified version of above is
-- ((((D.max_dividend + 1) - (N.n - 1)) * N.n) + (N.n - 1)) - 1
FROM DivisorSequences AS D
INNER JOIN
NonnegativeIntegers AS N
N.n BETWEEN D.first_divisor AND max_dividend + 1
END -- IF
ELSE
BEGIN -- ELSE
INSERT INTO DivisorSequences
(divisor_sequence, number_of_divisors, first_divisor, max_dividend)
SELECT CAST(N.n AS VARCHAR) + '.' + divisor_sequence,
D.number_of_divisors + 1,
N.n,
(D.max_dividend - N.n + 2) * N.n + N.n - 2
FROM DivisorSequences AS D
INNER JOIN
NonnegativeIntegers AS N
N.n BETWEEN D.first_divisor AND max_dividend + 1
END -- ELSE
END -- WHILE
END -- IF
-- Find all sequences to reduce a VARCHAR(10)
EXEC InsertDivisorSequences 10, 'N'
SELECT divisor_sequence, number_of_divisors, max_dividend
FROM DivisorSequences
WHERE max_dividend >= 10
ORDER BY number_of_divisors, divisor_sequence
divisor_sequence number_of_divisors max_dividend
3.2.2 3 10
3.3.2 3 10
4.2.2 3 10
4.3.2 3 10
2.2.2.2 4 16
3.2.2.2 4 22
4.2.2.2 4 26
5.2.2.2 4 28
5.5.2.2 4 28
5.5.3.2 4 28
6.2.2.2 4 28
6.5.2.2 4 28
6.5.3.2 4 28
7.2.2.2 4 26
7.5.2.2 4 26
7.5.3.2 4 26
8.2.2.2 4 22
8.5.2.2 4 22
8.5.3.2 4 22
9.2.2.2 4 16
9.5.2.2 4 16
9.5.3.2 4 16
We see that there are solutions of length 3 and 4 that can reduce a VARCHAR(10).
The divisor_sequence 4.3.2, for example, gives the divisors 4, 3, and 2 in that order.
To see only the shortest sequences
DELETE FROM DivisorSequences
EXEC InsertDivisorSequences 10
For a maximum-length VARCHAR, size 8000, the shortest sequences are indeed
of length 6. And there are over 200,000 to choose from! By the way, finding the
shortest sequences for length 8000 took under 30 seconds to execute on a 1.7 GHz P4
with 512 MB so investigation is not prohibitive.
Regards,
jag
> The idea is based on the fact that rest spaces are replaced in a later
> run. You first replace the biggest block, then replace the smaller
> blocks and mob up the resulting spaces from earlier replaces. When a
> varchar with up to 7998 consequtive spaces is pulled through a
> replace(space(132),space(1)), this can result in (at most) 59+131 = 180
> spaces. Pulled through replace(space(21),space(1)), this can result in
> (at most) 8+20 = 28 spaces. Etc. etc. The last replace of 2 spaces is to
> mob up the final spaces.
>
> The sequence below is one of many sequences that will do the job, and
> was developed outside in. In other words, if your varchar is smaller
> than 132, you can drop the innermost replace.
>
> Gert-Jan
>
> >
> > Please explain how? I am pretty sure I understand, and I don't doubt it
> > (well, after I tested it in the following loop :) but it is very cool.
> > Also, can you tell me where you got this? Like was it yours? I want to
> > make sure that I give credit when I mention it to others.
> >
> >
> >
> > begin
> > replicate(' ',132),' ')
> > , replicate(' ', 21),' ')
> > , replicate(' ', 6),' ')
> > , replicate(' ', 3),' ')
> > , replicate(' ', 2),' ')
> > , replicate(' ', 2),' ') <> '* *'
> > begin
> > end
> >
> > end
> >
> > --
> > ----------------------------------------------------------------------------
> > -----------
> > Compass Technology Management
> >
> > Pro SQL Server 2000 Database Design
> > http://www.apress.com/book/bookDisplay.html?bID=266
> >
> > Note: Please reply to the newsgroups only unless you are
> > interested in consulting services. All other replies will be ignored :)
> >
> > > The statement below will take care of all excess spaces in a varchar
> > > column:
> > >
> > > update tblX
> > > set col1 =
> > > Replace(
> > > Replace(
> > > Replace(
> > > Replace(
> > > Replace(
> > > Replace(colX, replicate(' ',132),' ')
> > > , replicate(' ', 21),' ')
> > > , replicate(' ', 6),' ')
> > > , replicate(' ', 3),' ')
> > > , replicate(' ', 2),' ')
> > > , replicate(' ', 2),' ')
> > >
> > > Hope this helps,
> > > Gert-Jan
> > >
> > >
> > > >
> > > > I have some values that I need to convert a varring number of spaces
> > > > between words and letters to one space. Is there a way to do this other
> > > > than running an update similar to the following until no more rows are
> > > > updated?
> > > >
> > > > Update tblX
> > > > Set colX = replace(colX, ' ', ' ')
> > > >
> > > > Thanks much
> > > > Carl
> > > >
> > > > *** Sent via Developersdex http://www.developersdex.com ***
> > > > Don't just participate in USENET...get rewarded for it!
|
|
|
|
|
Gert-Jan
|
Posted: Fri Feb 13 13:41:38 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
My apologies for not reading your original posting correctly. It was an
interesting analysis.
Gert-Jan
<snip>
> Just to be clear, my statement was not a lower bound, but an upper bound,
> on how many nested REPLACE calls of 2-to-1 space replacement are
> needed for a given size VARCHAR. That is, more than CEILING(LOG2(N))
> nested REPLACE calls of 2-to-1 space replacement are never needed
> to reduce a VARCHAR(N). Since 2^13 is 8192, a maximum-length
> VARCHAR of 8000 characters will require no more than 13 successive
> 2-to-1 space replacements.
<snip>
|
|
|
|
|
John
|
Posted: Sun Feb 15 02:56:36 CST 2004 |
Top |
SQL Server Developer >> Trying to remove multiple spaces in a string
> My apologies for not reading your original posting correctly. It was an
> interesting analysis.
I too found the results interesting. For the case where the value is 8000,
the number of six-number sequences is a whopping 224,112! As you had
mentioned, one of those sequences is 132, 21, 6, 3, 2, 2. Given this sequence,
if the value is < 132 and >= 21, say 25, then the five-number sequence
21, 6, 3, 2, 2 can be applied. However, for 25, it turns out that the
shortest sequences are of length 4, one of them being 6, 3, 2, 2. So
simply paring down a given optimal sequence length for one value
in a minimal way doesn't necessarily given one an optimal sequence
length for another value. Finally, these combinatorial problems are
interesting and challenging to code efficiently in SQL. A naive
implementation can really illustrate the meaning of NP-complete.
Regards,
jag
> Gert-Jan
>
> <snip>
> > Just to be clear, my statement was not a lower bound, but an upper bound,
> > on how many nested REPLACE calls of 2-to-1 space replacement are
> > needed for a given size VARCHAR. That is, more than CEILING(LOG2(N))
> > nested REPLACE calls of 2-to-1 space replacement are never needed
> > to reduce a VARCHAR(N). Since 2^13 is 8192, a maximum-length
> > VARCHAR of 8000 characters will require no more than 13 successive
> > 2-to-1 space replacements.
> <snip>
|
|
|
|
|
|
|