CSV parsing/writing?


Don Seiler
Afternoon.  Just writing to ask if anyone knows of any commons/jakarta
packages that may do CSV parsing and writing.  I'm aware of the jcsv
package but thought I would try and utilize commons as much as possible.
I looked at jakarta-oro as well but don't seem to see anything CSV
related.

Thanks in advance.
--
Don Seiler
[hidden email]

Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xFC87F041
Fingerprint: 0B56 50D5 E91E 4D4C 83B7  207C 76AC 5DA2 FC87 F041

Re: CSV parsing/writing?

Frank W. Zammetti
I might be missing something, but doesn't StringTokenizer do the trick
for you?

--
Frank W. Zammetti
Founder and Chief Software Architect
Omnytex Technologies
http://www.omnytex.com


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: CSV parsing/writing?

Don Seiler
On 17:46 Wed 25 May     , Frank W. Zammetti wrote:
> I might be missing something, but doesn't StringTokenizer do the trick
> for you?

Anyone with experience parsing CSVs knows there are the cases of
delimiters within quotes that make the parsing a bigger headache than
just using StringTokenizer (or String.split()).  Why else would there be
so many other third-party APIs for it?
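
To make the quoted-delimiter problem concrete, here is a minimal Java sketch. The class name NaiveSplitDemo and the sample line are made up for illustration and are not from any package mentioned in this thread. It splits on every comma, the way String.split() or a StringTokenizer would, and so breaks a quoted field in two.

```java
import java.util.Arrays;

// Illustrative only: shows why splitting on the raw delimiter miscounts fields.
final class NaiveSplitDemo {

    // Splits on every comma, with no awareness of quoting.
    static String[] naiveSplit(String line) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // Three logical fields; the second contains a comma inside quotes.
        String line = "1,\"Smith, John\",admin";
        String[] pieces = naiveSplit(line);
        // The quoted field is cut at its embedded comma, so the
        // count is one higher than the number of logical fields.
        System.out.println(pieces.length + " pieces: " + Arrays.toString(pieces));
    }
}
```

The sample line has three logical fields, but the split yields four pieces because the comma inside the quotes is treated as a separator.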

--
Don Seiler
[hidden email]

Re: CSV parsing/writing?

Paul deCoursey
In reply to this post by Frank W. Zammetti
Yes, you are missing something: escaped commas and quoted fields.  I don't
know of any part of commons that parses them.

pd

Re: CSV parsing/writing?

Frank W. Zammetti
In reply to this post by Don Seiler
Fair enough.  I have parsed CSVs a number of times; I guess I've been
lucky in that one of the design criteria was no occurrences of the
delimiter within data elements.  Certainly if there is a chance of that,
then sure, you need something more advanced.

Frank


RE: CSV parsing/writing?

James Sangster
I was looking into doing CSV parsing using regular expressions, but I came
across one post in a newsgroup where it was stated that regular expressions
couldn't handle it on their own.  Because the environment I was working
with had restricted regular expression capabilities and no third-party
package integration capabilities, I instead just went for the brute-force
method of parsing each line character by character using a state
machine.

It seems to work very well, but the performance could be a little better.
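
For readers who have not written one, a character-by-character state machine of the kind described above might look roughly like the following Java sketch. This is not James's code. The class name CsvLineParser, the state names, and the choices to treat a doubled quote as a literal quote and to be lenient about malformed input are all assumptions made for illustration. It handles a single line only, so quoted fields containing line breaks are out of scope.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-line CSV parser built as an explicit state machine.
final class CsvLineParser {

    private enum State { FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED }

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        State state = State.FIELD_START;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            switch (state) {
                case FIELD_START:
                    if (c == '"') {
                        state = State.QUOTED;       // opening quote
                    } else if (c == ',') {
                        fields.add("");             // empty field
                    } else {
                        current.append(c);
                        state = State.UNQUOTED;
                    }
                    break;
                case UNQUOTED:
                    if (c == ',') {
                        fields.add(current.toString());
                        current.setLength(0);
                        state = State.FIELD_START;
                    } else {
                        current.append(c);
                    }
                    break;
                case QUOTED:
                    if (c == '"') {
                        state = State.QUOTE_IN_QUOTED; // closing quote, or first half of ""
                    } else {
                        current.append(c);          // commas are plain data here
                    }
                    break;
                case QUOTE_IN_QUOTED:
                    if (c == '"') {
                        current.append('"');        // doubled quote means a literal quote
                        state = State.QUOTED;
                    } else if (c == ',') {
                        fields.add(current.toString());
                        current.setLength(0);
                        state = State.FIELD_START;
                    } else {
                        current.append(c);          // lenient: keep text after a closing quote
                        state = State.UNQUOTED;
                    }
                    break;
            }
        }
        // Flush the final field; an unterminated quote is accepted as-is.
        fields.add(current.toString());
        return fields;
    }
}
```

With this sketch, parseLine("1,\"Smith, John\",admin") returns the three fields 1, Smith, John and admin, with the embedded comma preserved inside the second field.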

james



Re: CSV parsing/writing?

Don Seiler
On 19:05 Wed 25 May     , James Sangster wrote:
> I was looking to doing CSV parsing using regular expressions, but I came
> across one post in a newsgroup where it was stated that regular expressions
> themselves couldn't handle it alone.   Because the environment I was working
> with had restricted regular expression capabilities and no third party
> package integration capabilities, I instead just went for the brute force
> method of parsing character for character on each line and using a state
> machine.
>
> It seems to work very well, but the performance could be a little better.

Would the jakarta community welcome a CSV parsing/writing module for
commons?  I'd be happy to work on it.  No doubt I would start down a
similar path of having to look at each character and track the state of
what is a field and what isn't.

--
Don Seiler
[hidden email]

Re: CSV parsing/writing?

Martin Cooper
On 5/25/05, Don Seiler <[hidden email]> wrote:

> On 19:05 Wed 25 May     , James Sangster wrote:
> > I was looking to doing CSV parsing using regular expressions, but I came
> > across one post in a newsgroup where it was stated that regular expressions
> > themselves couldn't handle it alone.   Because the environment I was working
> > with had restricted regular expression capabilities and no third party
> > package integration capabilities, I instead just went for the brute force
> > method of parsing character for character on each line and using a state
> > machine.
> >
> > It seems to work very well, but the performance could be a little better.
>
> Would the jakarta community welcome a CSV parsing/writing module for
> commons?  I'd be happy to work on it, no doubt I would start down a
> similar path of having to look at each character and track the state of
> what is a field and what isn't.

I'd be happy to see such a thing here in Commons. However, it would be
hard to believe that there isn't already such a thing in some Jakarta
or other ASF Java project that we could bring here, instead of writing
one from scratch.

--
Martin Cooper


Re: CSV parsing/writing?

Don Seiler
On 21:33 Wed 25 May     , Martin Cooper wrote:

> On 5/25/05, Don Seiler <[hidden email]> wrote:
> > Would the jakarta community welcome a CSV parsing/writing module for
> > commons?  I'd be happy to work on it, no doubt I would start down a
> > similar path of having to look at each character and track the state of
> > what is a field and what isn't.
>
> I'd be happy to see such a thing here in Commons. However, it would be
> hard to believe that there isn't already such a thing in some Jakarta
> or other ASF Java project that we could bring here, instead of writing
> one from scratch.
I was thinking the same thing.  As I said, I looked around in
jakarta-oro and the other commons packages.  From the available
descriptions I didn't see anything CSV-related.

--
Don Seiler
[hidden email]

Re: CSV parsing/writing?

Simon Kitching
In reply to this post by Don Seiler
On Wed, 2005-05-25 at 23:24 -0500, Don Seiler wrote:

> On 19:05 Wed 25 May     , James Sangster wrote:
> > I was looking to doing CSV parsing using regular expressions, but I came
> > across one post in a newsgroup where it was stated that regular expressions
> > themselves couldn't handle it alone.   Because the environment I was working
> > with had restricted regular expression capabilities and no third party
> > package integration capabilities, I instead just went for the brute force
> > method of parsing character for character on each line and using a state
> > machine.
> >
> > It seems to work very well, but the performance could be a little better.
>
> Would the jakarta community welcome a CSV parsing/writing module for
> commons?  I'd be happy to work on it, no doubt I would start down a
> similar path of having to look at each character and track the state of
> what is a field and what isn't.
>

There was a thread on this topic almost exactly two years ago, with
subject "[SURVEY] Commons-csv or not?":
  http://tinyurl.com/bojgz

Regards,

Simon


Re: CSV parsing/writing?

Don Seiler
On 16:40 Thu 26 May     , Simon Kitching wrote:
> There was a thread on this topic almost exactly two years ago, with
> subject "[SURVEY] Commons-csv or not?":
>   http://tinyurl.com/bojgz

Sounds like a good conversation, but it seemed to suddenly die with no
action.  As I said, I'd be happy to contribute at least a brute-force
parser to begin with for commons-io or whatever the jakarta gods deem
appropriate (IO makes the most sense to me, but I'm new here).

And, in my mind, CSV is not just "comma separated," so I would support
user-specified delimiters and field qualifiers (defaulting to comma and
double-quotes, respectively).
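
As a rough illustration of what user-specified delimiters and field qualifiers could look like on the writing side, here is a minimal Java sketch. The class DelimitedWriter and its methods are hypothetical and not part of any existing commons package. Doubling an embedded qualifier to escape it is an assumed convention, since the thread does not settle on one.

```java
// Hypothetical writer with a configurable delimiter and field qualifier.
final class DelimitedWriter {

    private final char delimiter;
    private final char qualifier;

    // Defaults as proposed above: comma delimiter, double-quote qualifier.
    DelimitedWriter() {
        this(',', '"');
    }

    DelimitedWriter(char delimiter, char qualifier) {
        this.delimiter = delimiter;
        this.qualifier = qualifier;
    }

    // Joins the fields into one line, qualifying only the fields that need it.
    String formatLine(String... fields) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.append(delimiter);
            }
            out.append(qualifyIfNeeded(fields[i]));
        }
        return out.toString();
    }

    private String qualifyIfNeeded(String field) {
        boolean needsQualifier = field.indexOf(delimiter) >= 0
                || field.indexOf(qualifier) >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQualifier) {
            return field;
        }
        StringBuilder sb = new StringBuilder();
        sb.append(qualifier);
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == qualifier) {
                sb.append(qualifier); // assumed escape: double the embedded qualifier
            }
            sb.append(c);
        }
        sb.append(qualifier);
        return sb.toString();
    }
}
```

For example, new DelimitedWriter().formatLine("1", "Smith, John", "admin") produces 1,"Smith, John",admin, while new DelimitedWriter('|', '\'') switches to pipe-separated fields qualified with single quotes.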

--
Don Seiler
[hidden email]

Re: CSV parsing/writing?

Catalin Grigoroscuta
In reply to this post by Don Seiler
No need to reinvent the wheel: try the Ostermiller CSV parser (see
ostermiller.org).  It is open source under the GPL licence.
It works fine for me.

Re: CSV parsing/writing?

Martin Cooper
On 5/25/05, Catalin Grigoroscuta <[hidden email]> wrote:
> No need to re-invent the wheel, try ostermiller CSV parser (see
> ostermiller.org) - open  source, GPL licence.
> It works fine for me.

A GPL license might be fine for people who want to pick up this
package and include it in their applications. However, the GPL is
fundamentally incompatible with the ASL, so it's not something we
could pick up and include in any Jakarta Commons component.

--
Martin Cooper


RE: CSV parsing/writing?

Chetan Sahasrabudhe
In reply to this post by Don Seiler
I guess a CSV component in commons is a very good suggestion.
This kind of thing is very much needed in data staging and processing.
If Java is to work alongside commercial products that process huge amounts of data,
then I would suggest starting an initiative to develop something for CSV processing.

As a first pass, I can think of the following feature list.

CSV Read
1. Configurable column selection.
2. Hibernate/Struts property-driven CSV read configuration. (Here I am talking about referencing third-party XML elements as target references.)
3. XSL-driven CSV conversions (CSV to XML, CSV to HTML, CSV to EDI, CSV to *new format*).
4. A CSVFilter, analogous to FileFilter, covering
   column range, column width range, row range.

Regards
Chetan



---------------------------------
This message contains the information that may be privileged and is  the property of the KPIT Cummins Infosystems LTD.It is intended only for the person to whom it is addressed. If you are not intended recipient, you are not authorized to read, print , retain copy, disseminate, distribute, or use this message or any part thereof. If you receive this message in error, please notify the sender immediately and delete all copies of this message. KPIT Cummins does not accept any liability for virus infected mails.

Re: CSV parsing/writing?

Catalin Grigoroscuta
Hi,

The features you describe here are very nice indeed, but I think that
most people would prefer a simple but fully functional CSV reader and
writer that would be finished like yesterday, with
hibernate/struts/xsl/EDI/whatever left to a future version.

I would definitely vote for a quick functional implementation (to be
ready in at most one week), and then let the users decide which features
they need most.

Cheers,
Catalin

Re: CSV parsing/writing?

Thomas Dudziak
In reply to this post by Chetan Sahasrabudhe
On 5/26/05, Chetan Sahasrabudhe <[hidden email]> wrote:

> I guess CSV commons is a very good suggestion.
> This kind of thing is very much required in data staging and processing.
> If java needs to work with commercial products to process huge amount of data
> then I would suggest to start the initiative on developing something for CSV processing.
>
> On initial thought I can think of following feature list.
>
> CSV Read
> 1. configurable column selection.
> 2. Hibernate / struts property driven CSV read configuration. (Here I am talking about referencing third party xml elements as target references.)
> 3. xsl driven CSV conversions (CSV to XML, CSV to HTML, CSV to EDI, CSV to *new format*)
> 4. CSVFilter as that for FileFilter
>    column range, column width range, row range

I don't know about a commons component specifically for
reading/writing CSV. This might be better solved by using something
like the CsvJdbc JDBC driver:

http://csvjdbc.sourceforge.net/

Btw, reading CSV via a parser generator like Antlr is rather easy.
There is for instance this sample here:

http://supportweb.cs.bham.ac.uk/documentation/tutorials/docsystem/build/tutorials/antlr/antlr.html#ANTLR-Translation-Example

Tom

RE: CSV parsing/writing?

Simon Kitching
In reply to this post by Chetan Sahasrabudhe
If the goal of the project is small, i.e. just a class to parse csv, then
commons-io, commons-codec, commons-lang are the obvious candidates. So it's
a matter of seeing if the committers on those projects are interested.

If the goal is larger, i.e. creating a new commons component itself, then
it is likely to be hard work. The way things usually become commons
components is that they are initially a successful part of some other
successful apache project and are spun off into a separate component
here. So one solution might be to find an apache project that would find
csv functionality useful, and then get the developers of that project to
join commons and become the "mentors" of a csv (or more ambitious)
project here.

Projects that might find csv handling useful include
 * workflow projects
 * B2B projects (geronimo?)
 * data import/export: POI?

It seems clear from the mails here that although there is some user
interest in this, there just aren't any existing committers willing to
dedicate the necessary time to mentoring this new project.

As another alternative, a project can be created on Sourceforge, using
the Apache Software License (ASL). That way, apache projects like the ones
listed above can happily use the code if they find a need to process csv
in the future. And at that point, friendly discussions might occur about
moving the project to apache commons.

Apache commons really isn't in the same business as sourceforge. This
means that not every good idea gets a home here. Or to look at it the
other way, if it doesn't find a home here that doesn't mean it isn't a
good idea.

(man, csv is a hard acronym to type. At least half the time it comes out
cvs :-).


Cheers,

Simon


RE: CSV parsing/writing?

Chetan Sahasrabudhe
In reply to this post by Don Seiler
I have long been looking for a high-performance solution for processing delimited data.

To illustrate the performance problem, please consider this example:

one,two,three,four,five ---- (, is used as delimiter)

While parsing this string, one needs to traverse it char by char to find each delimiter and then act on the segment.

I am trying to work out a solution that simulates how humans normally read.
Humans change their reading habits as they read more.

Initially we read char by char, then form a word in our brain and attach a meaning to it.
As we grow older, we start picking up 2 to 3 words in one read and process them pretty fast.

The point I am trying to make here is: can we make our code more intelligent, so that it takes a snapshot of the data and identifies a pattern?
I know this sounds pretty hazy, but I am after some way to stop parsing char by char and develop an algorithm that reads the memory block in chunks and identifies whether there are any delimiters in each chunk. If a delimiter is found, then parse that chunk char by char to get its position.

Take a small test here: count the number of commas in each row.

one,tw
one,two,thr
one,,,two

While looking at this test data, did you do char-by-char parsing or snapshot reading?
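
One modest step in this direction, sketched below in Java, is to let String.indexOf() jump from one delimiter to the next instead of writing the per-character loop by hand. The class DelimiterScan and its methods are hypothetical. The sketch assumes data with no quoted fields, and it makes no claim about speed, since indexOf() still examines every character internally and nothing here has been measured.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds delimiters via indexOf rather than an explicit char loop.
final class DelimiterScan {

    // Counts delimiters by jumping from one occurrence to the next.
    static int countDelimiters(String line, char delimiter) {
        int count = 0;
        int pos = line.indexOf(delimiter);
        while (pos >= 0) {
            count++;
            pos = line.indexOf(delimiter, pos + 1);
        }
        return count;
    }

    // Splits a line that is assumed to contain no quoted fields.
    static List<String> splitUnquoted(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        int pos;
        while ((pos = line.indexOf(delimiter, start)) >= 0) {
            fields.add(line.substring(start, pos));
            start = pos + 1;
        }
        fields.add(line.substring(start)); // text after the last delimiter
        return fields;
    }
}
```

Applied to the three test rows above, countDelimiters returns 1, 2 and 3 respectively.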

Regards
Chetan


Re: CSV parsing/writing?

Andy Lewis
In reply to this post by Chetan Sahasrabudhe

Neat thoughts - but CSV to EDI translation is not a trivial task like
parsing a delimited file. EDI is a highly structured, hierarchical
format. CSV is a simply structured flat format. If you want to have
tools to translate between flat, hierarchical (and/or relational) data,
you are talking about writing an extract-transform-load system, not just
a CSV parser.  HUGE difference in scope.

I'd recommend focusing on a delimited file parser for now.

Re: CSV parsing/writing?

Don Seiler
In reply to this post by Catalin Grigoroscuta
On 09:49 Thu 26 May     , Catalin Grigoroscuta wrote:
> The features you describe here are very nice indeed, but I think that
> most people would prefer a simple but fully functional CSV reader and
> writer that would be finished like yesterday, and a future version with
> hibernate/struts/xsl/EDI/whatever.
>
> I would definitely vote for a quick functional implementation (to be
> ready in at most one week), and then let the users decide what features
> they mostly need.

These are my thoughts as well.  I need something relatively quick, and a
basic parser/writer should be doable in a week.  I want to provide at
least the same functionality as the Text::CSV_XS module in Perl.

I would be willing to start an sf.net project under the ASL, and
then jakarta would be more than welcome to do what they want with it.
Any and all are welcome to join as well.

What do you folks think?

P.S. - And, yes, I saw Ostermiller's package.  I'm more interested in
helping out jakarta with whatever my meager skills can provide.

--
Don Seiler
[hidden email]
