What Doug suggests should work just fine for a one-off replace. However, if you're looking for a programmatic way to parse the file at some interval, the following class might help.
using System;
using System.Collections.Generic;

public class StringFunctions
{
    // Splits expression on delimiter, treating text wrapped in qualifier as one field.
    // Example: Split("1,4,55,\"aaron,williams\",1", ',', '"', false)
    // returns { "1", "4", "55", "aaron,williams", "1" }.
    public static string[] Split(string expression, char delimiter, char qualifier, bool ignoreCase)
    {
        expression = expression.Trim();
        if (ignoreCase)
        {
            // note: this lower-cases the returned fields as well
            expression = expression.ToLower();
            delimiter = char.ToLower(delimiter);
            qualifier = char.ToLower(qualifier);
        }
        int len = expression.Length;
        List<string> list = new List<string>();
        int begField, endField; // text cursors
        for (begField = endField = 0; endField < len; begField = endField)
        {
            char s = expression[endField];
            bool entityContainsQualifiers = false;
            // move to the delimiter
            while (s != delimiter)
            {
                if (s != qualifier)
                {
                    // consume and continue if possible
                    ++endField;
                    if (len <= endField) { break; }
                    else { s = expression[endField]; continue; }
                }
                #region Consume Text Within Two Qualifiers
                // we have the qualifier symbol
                // then move to the closing one
                entityContainsQualifiers = true;
                bool foundClosingQualifier = false;
                for (endField = endField + 1; endField < len; ++endField)
                {
                    s = expression[endField];
                    if (endField + 1 < len)
                    {
                        // a closing qualifier must be followed by the delimiter
                        if (s == qualifier && expression[endField + 1] == delimiter)
                        {
                            foundClosingQualifier = true;
                            break;
                        }
                    }
                    else
                    {
                        // last character: a qualifier here closes the field
                        if (s == qualifier)
                        {
                            foundClosingQualifier = true;
                            break;
                        }
                    }
                }
                if (!foundClosingQualifier)
                {
                    throw new ArgumentException
                        ("expression contains an unclosed qualifier symbol");
                }
                // consume the closing qualifier and continue if possible
                ++endField;
                if (len <= endField) { break; }
                else { s = expression[endField]; continue; }
                #endregion
            }//while (s != delimiter)
            // everything between the begField and endField cursors is the entity...
            string entity = expression.Substring(begField, endField - begField);
            if (entityContainsQualifiers)
            {
                // strip every qualifier character from the field
                entity = entity.Replace(new string(qualifier, 1), "");
            }
            list.Add(entity);
            // two possibilities:
            // 1) we have found the delimiter
            // 2) we have come to the end of the expression
            // possibility (1)
            if (s == delimiter)
            {
                // consume and continue if possible
                ++endField;
                if (len <= endField)
                {
                    // this delimiter is the last symbol of the expression
                    // we should add the empty string as the last entity
                    // and leave
                    list.Add(string.Empty);
                    break;
                }
                else
                {
                    // there are more entities in the expression
                    // proceed with collecting the entities
                    // note: s - initialization is done at the beginning of the main cycle
                    continue;
                }
            }
            else // possibility (2)
            {
                // leave the cycle
                break;
            }
        }
        return list.ToArray();
    }
}
I didn't write this code (and unfortunately don't remember where this is from...) but I used this a while back for parsing CSVs and if I recall it worked great.
Parsing CSV files with embedded commas
Hi,
Sorry, this post is not directly Umbraco-related, but this is the most helpful community I am a part of! I have a CSV file that contains fields with embedded commas, e.g. 1,4,55,"aaron,williams",1
What is the easiest way for me to remove the comma in "aaron,williams"? It will always be surrounded by the "" quotes.
Thanks
Haha... we sure try to be friendly and helpful.
And even though this is not really an Umbraco question at all...
I'd open the CSV in Visual Studio and use a regex search-replace for commas inside double quotes.
The regex (in Visual Studio's Find and Replace syntax, where {} tags a group and \1 refers back to it) would be something like this:
FIND: "{.*},{.*}"
REPLACE: "\1 \2"
cheers,
doug.
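For anyone who wants to do the same replacement in code rather than in the editor, here is a minimal sketch using .NET's Regex class. It is not Doug's exact expression: it matches each quoted field and swaps every comma inside it for a space, so it also covers fields with more than one comma. The class and method names are made up for illustration, and it assumes the quoted fields do not themselves contain escaped quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedCommaCleaner
{
    // Hypothetical helper (not from the thread): replaces every comma that sits
    // inside a "..." pair with a space and leaves the delimiting commas alone.
    public static string RemoveQuotedCommas(string line)
    {
        // "[^"]*" matches one quoted field; the evaluator rewrites only that match.
        return Regex.Replace(line, "\"[^\"]*\"", m => m.Value.Replace(",", " "));
    }
}
```

Calling QuotedCommaCleaner.RemoveQuotedCommas on the sample line 1,4,55,"aaron,williams",1 gives 1,4,55,"aaron williams",1.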
Hi,
In UmbImport I'm using this custom CSV Parser. It's taking care of these issues.
Hope it helps you,
Richard
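Richard's parser is not reproduced in this thread. As a stand-in, and only as a sketch, the .NET Framework ships a parser that understands quote-enclosed fields: TextFieldParser in the Microsoft.VisualBasic.FileIO namespace (it needs a reference to the Microsoft.VisualBasic assembly, and works fine from C#). The wrapper class and method names below are made up for illustration.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLineReader
{
    // Hypothetical helper (not Richard's parser): splits a single CSV line
    // using the framework's TextFieldParser.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // keep commas inside "..." as part of the field
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

For the sample line 1,4,55,"aaron,williams",1 this returns five fields, the fourth being aaron,williams with the quotes removed. To read a whole file, the same parser can be constructed from a file path and ReadFields called in a loop until EndOfData is true.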
Hi,
What Doug suggests should work just fine for a one-off replace. However, if you're looking for a programmatic way to parse the file at some interval, the following class might help.
I didn't write this code (and unfortunately don't remember where this is from...) but I used this a while back for parsing CSVs and if I recall it worked great.
HTH,
Nik