news.only-4-geeks.com Forum Index » C | Goto page 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 Next |
|  | Comment on trim string function please |  | |
| | | swengineer001@gmail.com |  |
| Posted: Thu Jul 10, 2008 4:59 pm Post subject: Comment on trim string function please |  |
Just looking for a few eyes on this code other than my own.
void TrimCString(char *str) { // Trim whitespace from beginning: size_t i = 0; size_t j;
while(isspace(str[i])) { i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++) { str[j] = str[i]; } str[j] = '\0'; }
// Trim whitespace from end: i = strlen(str) - 1;
while(isspace(str[i])) { i--; } if(i < (strlen(str) - 1)) { str[i + 1] = '\0'; } } |
| |
| | | Jens Thoms Toerring |  |
| Posted: Thu Jul 10, 2008 4:59 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
swengineer001@gmail.com <swengineer001@gmail.com> wrote:
| Quote: | Just looking for a few eyes on this code other than my own.
void TrimCString(char *str) { // Trim whitespace from beginning:
|
I guess I would trim from the end first since then you have less copying to do afterwards.
| Quote: | size_t i = 0; size_t j;
while(isspace(str[i]))
|
isspace() expects an int and not a char as it's argument.
| Quote: | { i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++) { str[j] = str[i]; } str[j] = '\0'; }
|
An alternative would be to use memmove() here, so you don't have to do it byte by byte. Also callling strlen() each time through the loop is a bit of a waste of time - it doesn't change and can be replaced by a check if str[i] is '\0'.
| Quote: | // Trim whitespace from end: i = strlen(str) - 1;
|
Careful: This could set 'i' to -1 (if the string consistet of white space only) and then the rest won't work anymore.
| Quote: | while(isspace(str[i])) { i--; }
if(i < (strlen(str) - 1)) { str[i + 1] = '\0'; } }
|
Here's an alternative version using pointers (and trying to minimize the number of calls of strlen()):
void TrimCString( char *str ) { char *p, *q;
/* Check that we've got something that looks like a string */
if ( ! str || ! * str ) return;
/* Trim from end */
for ( p = str + strlen( str ) - 1; p != str && isspace( ( int ) *p ); p-- ) /* empty */ ;
if ( p == str ) /* only white space in string */ { *str = '\0'; return; }
*++p = '\0';
/* Trim from start */
for ( q = str; isspace( ( int ) *q ); q++ ) /* empty */ ;
if ( q != str ) memmove( str, q, p - q + 1 ); } Regards, Jens -- \ Jens Thoms Toerring ___ jt@toerring.de \__________________________ LINK |
| |
| | | Eric Sosman |  |
| Posted: Thu Jul 10, 2008 4:59 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
swengineer001@gmail.com wrote:
| Quote: | Just looking for a few eyes on this code other than my own.
|
This pair of eyes sees three bugs, two occurring twice each and the other perhaps due to too much snippage. There are also some opportunities to improve speed and style. So, here we go:
Bug: Since you're using isspace() and strlen(), you need to #include <ctype.h> and <string.h> to get their declarations. Without the declarations, a compiler operating under C99 rules must generate a diagnostic. Under C89 rules the code will work, but might not be as fast as if the vendor's "magic" declarations were present.
| Quote: | void TrimCString(char *str) { // Trim whitespace from beginning: size_t i = 0;
|
Style: Instead of initializing `i' at its declaration and then relying on the initial value later on, consider initializing it closer to its use. The `for' statement is convenient for such purposes.
| Quote: | size_t j;
while(isspace(str[i]))
|
Bug: If `str' contains negative-valued characters, this use may violate the "contract" of isspace() by passing an out-of-range argument. Use `while (isspace( (unsigned char)str[i] ))'. (This is one of those occasions where a cast is almost always required and almost always omitted, as opposed to the more usual situation where a cast is almost always inserted and almost always wrong.)
| Quote: | { i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++)
|
Speed: This loop calculates strlen(str) on every iteration. Since it will return the same value each time, all calls after the first are wasted effort. Call strlen() once before the loop and store the result in a variable, and use that variable to control the loop.
| Quote: | { str[j] = str[i]; } str[j] = '\0'; }
// Trim whitespace from end: i = strlen(str) - 1;
|
Bug: If `str' is the empty string (either because it was empty to begin with or because it contained only white space), this calculation will produce a very large `i' that is almost assuredly wrong, R-O-N-G.
| Quote: | while(isspace(str[i]))
|
Bug: Same missing cast as above.
| Quote: | { i--; } if(i < (strlen(str) - 1))
|
Bug: Same mishandling of the empty string as above.
Speed: strlen(str) is still the same as it was a few lines ago, so there's no point in computing it again.
| Quote: | { str[i + 1] = '\0';
|
Speed: It's probably quicker -- and certainly less verbose -- to do this assignment unconditionally than to test whether it's needed. If it isn't, you'll just store a zero on top of an existing zero, which is harmless.
Summary: Not quite ready for prime time, but not as bad as some attempts I've seen.
Challenge: See if you can think of a way to do the job in just one pass over the string (calling strlen() counts as a "pass"). Hint: During the copy-to-front operation, can you somehow figure out where the final '\0' should land without going back and inspecting the moved characters a second time?
-- Eric.Sosman@sun.com |
| |
| | | Ben Bacarisse |  |
| Posted: Thu Jul 10, 2008 4:59 pm Post subject: Re: Comment on trim string function please |  |
Eric Sosman <Eric.Sosman@sun.com> writes:
| Quote: | swengineer001@gmail.com wrote: Just looking for a few eyes on this code other than my own. snip size_t i = 0;
Style: Instead of initializing `i' at its declaration and then relying on the initial value later on, consider initializing it closer to its use. The `for' statement is convenient for such purposes.
|
If you are suggesting replacing the while (isspace(str[i]) with a for (int i = 0; isspace((unsigned char)str[i]; i++); then this will not allow i to tested outside the for:
| Quote: | while(isspace(str[i])) { i++; } if(i > 0)
|
here.
-- Ben. |
| |
| | | Michael Brennan |  |
| Posted: Thu Jul 10, 2008 5:04 pm Post subject: Re: Comment on trim string function please |  |
On 2008-07-10, swengineer001@gmail.com <swengineer001@gmail.com> wrote:
| Quote: | Here is my second go at it after reading the comments provided.
|
<snip>
| Quote: | // Trim whitespace from end: //Added check to catch the empty string i = (strlen(str) ? (strlen(str) - 1) : 0);
while(isspace((unsigned char)str[i])) { i--; } //Removed check that was not needed str[i + 1] = '\0'; }
|
2 things I found:
1. You are missing a return statement at the end.
2. This will not be correct if someone calls your function with an array of length one (with its only element set to '\0'). The index in your assignment at the last line will then be 1.
-- Michael Brennan |
| |
| | | Eric Sosman |  |
| Posted: Thu Jul 10, 2008 5:23 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
swengineer001@gmail.com wrote:
| Quote: | Here is my second go at it after reading the comments provided.
//This was in my original file but just not immediately before this function #include <ctype.h #include <string.h
//Added char* return as suggested char* TrimCString(char *str) { // Trim whitespace from beginning: size_t i = 0; size_t j;
//Added validation of the argument for NULL if(str == NULL) { return str; }
//Added cast as suggested. I have always avoided casts for various reasons //discussed in the FAQ and elsewhere but it is good to know this case while(isspace((unsigned char)str[i])) { i++; } if(i > 0) { //Calculate length once size_t length = strlen(str); //Decided to leave the bytewise copy I just think I can understand it a little better for(j = 0; i < length; j++, i++) { str[j] = str[i]; } str[j] = '\0'; }
|
Why move the characters at all? The original version needed to because it returned no value, so the only way it could report the position of the first non-space was to squeeze out the leading spaces. But now that you're returning a pointer, why not just point at the first non-space no matter where you find it?
| Quote: | // Trim whitespace from end: //Added check to catch the empty string i = (strlen(str) ? (strlen(str) - 1) : 0);
while(isspace((unsigned char)str[i])) { i--; }
|
This still mis-handles the empty string. Work through it: You'll set `i' to zero, and then you'll test whether `str[0]' is a space. It's the '\0', so it's not a space, so you decrement `i' from zero to a very large number. Then you check whether `str[very_large_number]' is a space, and there's no telling what will happen except that it's not likely to be good ...
| Quote: | //Removed check that was not needed str[i + 1] = '\0'; }
|
You still haven't picked up on the hints about an easier way to deal with the trailing spaces. Let's try another analogy: Imagine you're watching a football ("soccer") game that has reached the shoot-out tie-breaking phase. Abelard shoots and the goalkeeper makes the save (space). Then Bertrand shoots and scores (non-space). Then Claude shoots, and Daniel, and Edouard, ... When the match is over, you know you will be asked which shooter scored the side's final goal (which may or may not have been followed by a few saves). Unfortunately, your memory is terrible -- but you have an erasable slate with enough space to write one name at a time. Can you think of a method that will ensure you can answer the question correctly?
-- Eric.Sosman@sun.com |
| |
| | | Guest |  |
| Posted: Thu Jul 10, 2008 5:39 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
swengineer001@gmail.com wrote:
| Quote: | Just looking for a few eyes on this code other than my own.
void TrimCString(char *str)
|
Why not return a char *, like most other string functions?
char *TrimCString(char *str)
using char * enables you to piggyback your function in the middle of other functions, eg printf("%s\n", TrimCString(someString));
| Quote: | { // Trim whitespace from beginning: size_t i = 0; size_t j;
while(isspace(str[i])) { i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++)
|
move the strlen() outside the loop. Maybe use memmove() instead of the loop.
| Quote: | { str[j] = str[i]; } str[j] = '\0'; }
// Trim whitespace from end: i = strlen(str) - 1;
|
Use the strlen you computed before, when you moved it out of the for loop above :)
| Quote: | while(isspace(str[i])) { i--; } if(i < (strlen(str) - 1)) { str[i + 1] = '\0';
|
No need for the test. when i >= (strlen(str) - 1) -- it can only be equal, anyway -- the assignment overwrites a '\0' with a brand new '\0'. Anyway, if you want to keep the test, use the computed strlen.
A couple what-if's
* what if a pass NULL to the function? TrimCString(NULL);
* what if I pass a constant string literal to the function? TrimCString(" 4 spaces at both ends "); |
| |
| | | swengineer001@gmail.com |  |
| Posted: Thu Jul 10, 2008 5:56 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
On Jul 10, 1:32 pm, j...@toerring.de (Jens Thoms Toerring) wrote:
| Quote: | swengineer...@gmail.com <swengineer...@gmail.com> wrote: Just looking for a few eyes on this code other than my own. void TrimCString(char *str) { // Trim whitespace from beginning:
I guess I would trim from the end first since then you have less copying to do afterwards.
size_t i = 0; size_t j; while(isspace(str[i]))
isspace() expects an int and not a char as it's argument. Doesn't this promotion happen implicitly and without loss of |
information since I am actually dealing with characters and not using the char as a holder for small integers?
| Quote: | { i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++) { str[j] = str[i]; } str[j] = '\0'; }
An alternative would be to use memmove() here, so you don't have to do it byte by byte. Also callling strlen() each time through the loop is a bit of a waste of time - it doesn't change and can be replaced by a check if str[i] is '\0'.
// Trim whitespace from end: i = strlen(str) - 1;
Careful: This could set 'i' to -1 (if the string consistet of white space only) and then the rest won't work anymore. i is of type size_t which I believe can't be negative?
while(isspace(str[i])) { i--; } if(i < (strlen(str) - 1)) { str[i + 1] = '\0'; } }
Here's an alternative version using pointers (and trying to minimize the number of calls of strlen()):
void TrimCString( char *str ) { char *p, *q;
/* Check that we've got something that looks like a string */
if ( ! str || ! * str ) return;
/* Trim from end */
for ( p = str + strlen( str ) - 1; p != str && isspace( ( int ) *p ); p-- ) /* empty */ ;
if ( p == str ) /* only white space in string */ { *str = '\0'; return; }
*++p = '\0';
/* Trim from start */
for ( q = str; isspace( ( int ) *q ); q++ ) /* empty */ ;
if ( q != str ) memmove( str, q, p - q + 1 );} Thanks for the code and the comments. This is useful. |
|
| |
| | | swengineer001@gmail.com |  |
| Posted: Thu Jul 10, 2008 6:02 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
On Jul 10, 1:39 pm, badc0...@gmail.com wrote:
| Quote: | swengineer...@gmail.com wrote: Just looking for a few eyes on this code other than my own.
void TrimCString(char *str)
Why not return a char *, like most other string functions?
char *TrimCString(char *str)
using char * enables you to piggyback your function in the middle of other functions, eg printf("%s\n", TrimCString(someString)); Good point, I had not thought of it.
{ // Trim whitespace from beginning: size_t i = 0; size_t j;
while(isspace(str[i])) { i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++)
move the strlen() outside the loop. Maybe use memmove() instead of the loop.
{ str[j] = str[i]; } str[j] = '\0'; }
// Trim whitespace from end: i = strlen(str) - 1;
Use the strlen you computed before, when you moved it out of the for loop above  the length of the string has changed since then, possibly.
while(isspace(str[i])) { i--; } if(i < (strlen(str) - 1)) { str[i + 1] = '\0';
No need for the test. when i >= (strlen(str) - 1) -- it can only be equal, anyway -- the assignment overwrites a '\0' with a brand new '\0'. Anyway, if you want to keep the test, use the computed strlen.
} }
A couple what-if's
* what if a pass NULL to the function? TrimCString(NULL); I added a check for this.
* what if I pass a constant string literal to the function? TrimCString(" 4 spaces at both ends "); How can I account for this? |
|
| |
| | | swengineer001@gmail.com |  |
| Posted: Thu Jul 10, 2008 6:07 pm Post subject: Re: Comment on trim string function please |  |
| |  | |
On Jul 10, 1:41 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:
| Quote: | swengineer...@gmail.com wrote: Just looking for a few eyes on this code other than my own.
This pair of eyes sees three bugs, two occurring twice each and the other perhaps due to too much snippage. There are also some opportunities to improve speed and style. So, here we go:
Bug: Since you're using isspace() and strlen(), you need to #include <ctype.h> and <string.h> to get their declarations. Without the declarations, a compiler operating under C99 rules must generate a diagnostic. Under C89 rules the code will work, but might not be as fast as if the vendor's "magic" declarations were present.
void TrimCString(char *str) { // Trim whitespace from beginning: size_t i = 0;
Style: Instead of initializing `i' at its declaration and then relying on the initial value later on, consider initializing it closer to its use. The `for' statement is convenient for such purposes.
size_t j;
while(isspace(str[i]))
Bug: If `str' contains negative-valued characters, this use may violate the "contract" of isspace() by passing an out-of-range argument. Use `while (isspace( (unsigned char)str[i] ))'. (This is one of those occasions where a cast is almost always required and almost always omitted, as opposed to the more usual situation where a cast is almost always inserted and almost always wrong.)
{ i++; } if(i > 0) { for(j = 0; i < strlen(str); j++, i++)
Speed: This loop calculates strlen(str) on every iteration. Since it will return the same value each time, all calls after the first are wasted effort. Call strlen() once before the loop and store the result in a variable, and use that variable to control the loop.
{ str[j] = str[i]; } str[j] = '\0'; }
// Trim whitespace from end: i = strlen(str) - 1;
Bug: If `str' is the empty string (either because it was empty to begin with or because it contained only white space), this calculation will produce a very large `i' that is almost assuredly wrong, R-O-N-G.
while(isspace(str[i]))
Bug: Same missing cast as above.
{ i--; } if(i < (strlen(str) - 1))
Bug: Same mishandling of the empty string as above.
Speed: strlen(str) is still the same as it was a few lines ago, so there's no point in computing it again.
{ str[i + 1] = '\0';
Speed: It's probably quicker -- and certainly less verbose -- to do this assignment unconditionally than to test whether it's needed. If it isn't, you'll just store a zero on top of an existing zero, which is harmless.
} }
Summary: Not quite ready for prime time, but not as bad as some attempts I've seen.
Challenge: See if you can think of a way to do the job in just one pass over the string (calling strlen() counts as a "pass"). Hint: During the copy-to-front operation, can you somehow figure out where the final '\0' should land without going back and inspecting the moved characters a second time?
-- Eric.Sos...@sun.com
|
Thanks for the input. Someone else had pointed out one of the issues you mentioned but I didn't understsnd what he was saying until I read your description. |
| |
| Page 1 of 14 .:. Goto page 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 Next | |
|
|