# [compress] TarArchiveEntry and Windows drive letter

 Classic List Threaded
10 messages
Reply | Threaded
Open this post in threaded view
|

## [compress] TarArchiveEntry and Windows drive letter

 I just found a new issue with compress. It's the "normalizeFileName" in TarArchiveEntry again. On Windows it just strips the drive letter https://github.com/apache/commons-compress/blob/master/src/main/java/org/apache/commons/compress/archivers/tar/ TarArchiveEntry.java#L1337 which I think is a questionable default behaviour IMO.    TarArchiveEntry("C:\foo\bar") -> "/foo/bar"    TarArchiveEntry("D:\foo\bar") -> "/foo/bar" Is that in line with how other unix tool ports handle things on windows? cheers, Torsten PS: Just for you sebb ;)
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 I do not see how skipping the drive letter on Windows is not normalizing, it is _moving_.... sounds like a bug to me. Gary On Wed, Jan 3, 2018 at 8:46 AM, Torsten Curdt <[hidden email]> wrote: > I just found a new issue with compress. > > It's the "normalizeFileName" in TarArchiveEntry again. > On Windows it just strips the drive letter > > https://github.com/apache/commons-compress/blob/master/src/> main/java/org/apache/commons/compress/archivers/tar/ > TarArchiveEntry.java#L1337 > > which I think is a questionable default behaviour IMO. > >    TarArchiveEntry("C:\foo\bar") -> "/foo/bar" >    TarArchiveEntry("D:\foo\bar") -> "/foo/bar" > > Is that in line with how other unix tool ports handle things on windows? > > cheers, > Torsten > > PS: Just for you sebb ;) >
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 In reply to this post by Torsten Curdt-3 On 2018-01-03, Torsten Curdt wrote: > I just found a new issue with compress. > It's the "normalizeFileName" in TarArchiveEntry again. > On Windows it just strips the drive letter > https://github.com/apache/commons-compress/blob/master/src/> main/java/org/apache/commons/compress/archivers/tar/ > TarArchiveEntry.java#L1337 > which I think is a questionable default behaviour IMO. >    TarArchiveEntry("C:\foo\bar") -> "/foo/bar" >    TarArchiveEntry("D:\foo\bar") -> "/foo/bar" "strip the drive letter" predates preserveLeadingSlashes https://github.com/apache/commons-compress/commit/32eea1e611d66a3c1a6b1dc0b373a943614e5680#diff-b7b626a7d92545685d08e93834bfa3d0triggered by https://issues.apache.org/jira/browse/COMPRESS-26and later https://github.com/apache/commons-compress/commit/3e2ddad124ea6ab1bc690487793bf88560ca3732#diff-b7b626a7d92545685d08e93834bfa3d0I could be convinced that we should only strip drive letters when we don't preserve leading slashes either :-) Usually tar archives do not contain absolute path names at all. > Is that in line with how other unix tool ports handle things on windows? No idea, really. Stefan --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 > >    TarArchiveEntry("C:\foo\bar") -> "/foo/bar" > >    TarArchiveEntry("D:\foo\bar") -> "/foo/bar" > > "strip the drive letter" predates preserveLeadingSlashes > Might be but I guess the only question is how to fix it without breaking too many things. I am not such a big fan of "preserveLeadingSlashes" either. I could be convinced that we should only strip drive letters when we > don't preserve leading slashes either :-) > IMO there is just no case at all when compress should strip drive letters. Sure it might be convenient (for some) but this is too much magic. Usually tar archives do not contain absolute path names at all. > Which is also why the current constructors have a huge potential of ambiguity and misinterpretation IMO.   TarArchiveEntry(File file)   TarArchiveEntry(File file, String fileName)   TarArchiveEntry(String name)   TarArchiveEntry(String name, boolean preserveLeadingSlashes)   TarArchiveEntry(String name, byte linkFlag)   TarArchiveEntry(String name, byte linkFlag, boolean preserveLeadingSlashes) If it should be relative and you pass a File - relative to what? If I pass in a string there should be no need for "preserveLeadingSlashes" it should be just that string. Just some random thoughts. Maybe worth cleaning up that part of the API. > Is that in line with how other unix tool ports handle things on windows? > > No idea, really. > It might be worth checking with Windows users to see what the expected behaviour might be. But even then - I still think silently stripping is the wrong thing to do. cheers, Torsten
Reply | Threaded
Open this post in threaded view
|

 On 2018-01-05, Torsten Curdt wrote: >>>    TarArchiveEntry("C:\foo\bar") -> "/foo/bar" >>>    TarArchiveEntry("D:\foo\bar") -> "/foo/bar" >> "strip the drive letter" predates preserveLeadingSlashes > Might be but I guess the only question is how to fix it without breaking > too many things. > I am not such a big fan of "preserveLeadingSlashes" either. We inherited that from Ant and it really stems from a rather odd usecase over there. >> I could be convinced that we should only strip drive letters when we >> don't preserve leading slashes either :-) > IMO there is just no case at all when compress should strip drive letters. > Sure it might be convenient (for some) but this is too much magic. Again, I have no idea what tar ports do on Windows. By now all my Windows instances run the Linux subsystem if I want to use tar. This is what GNU tar does on Linux: $tar cf x.tar /tmp tar: Removing leading /' from member names so us stripping leading slashes by default is completely in line with what the CLI tar does. >> Usually tar archives do not contain absolute path names at all. > Which is also why the current constructors have a huge potential of > ambiguity and misinterpretation IMO. I've just now changed the javadocs to explicitly state how names get constructed. I've left out documenting what happens to drive letters so I don't need to change it depending on the outcome of this discussion :-) > If it should be relative and you pass a File - relative to what? It is File.getPath() so it is the string argument passed to the File constructor. > It might be worth checking with Windows users to see what the expected > behaviour might be. Allow me to point out that this has been the behavior of the class for far more than ten years and no Windows user has ever complained. Neither here nor in Ant land. This may indicate the behavior is not too surprising. It may also indicate that most people really only want to use relative file names regardless of platform. > But even then - I still think silently stripping is the wrong thing to > do. The contract of tar archives is they contain relative path names. GNU tar strips leading slashes both when creating archives and when extracting archives who's entry names contain leading slashes. Sticking with that contract I still believe we should remove leading slashes by default - and removing the drive letters is the Windows equivalent of it IMHO. That's why I suggest to keep the drive letter in the preserveLeadingSlashes == true case. Cheers Stefan --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email] Reply | Threaded Open this post in threaded view | ## Re: [compress] TarArchiveEntry and Windows drive letter  On 21 January 2018 at 16:07, Stefan Bodewig <[hidden email]> wrote: > On 2018-01-05, Torsten Curdt wrote: > >>>> TarArchiveEntry("C:\foo\bar") -> "/foo/bar" >>>> TarArchiveEntry("D:\foo\bar") -> "/foo/bar" > >>> "strip the drive letter" predates preserveLeadingSlashes > >> Might be but I guess the only question is how to fix it without breaking >> too many things. >> I am not such a big fan of "preserveLeadingSlashes" either. > > We inherited that from Ant and it really stems from a rather odd usecase > over there. > >>> I could be convinced that we should only strip drive letters when we >>> don't preserve leading slashes either :-) > >> IMO there is just no case at all when compress should strip drive letters. >> Sure it might be convenient (for some) but this is too much magic. > > Again, I have no idea what tar ports do on Windows. By now all my > Windows instances run the Linux subsystem if I want to use tar. > > This is what GNU tar does on Linux: > >$ tar cf x.tar /tmp > tar: Removing leading /' from member names > > so us stripping leading slashes by default is completely in line with > what the CLI tar does. > >>> Usually tar archives do not contain absolute path names at all. > >> Which is also why the current constructors have a huge potential of >> ambiguity and misinterpretation IMO. > > I've just now changed the javadocs to explicitly state how names get > constructed. I've left out documenting what happens to drive letters so > I don't need to change it depending on the outcome of this discussion :-) > >> If it should be relative and you pass a File - relative to what? > > It is File.getPath() so it is the string argument passed to the File > constructor. > >> It might be worth checking with Windows users to see what the expected >> behaviour might be. > > Allow me to point out that this has been the behavior of the class for > far more than ten years and no Windows user has ever complained. > Neither here nor in Ant land. This may indicate the behavior is not too > surprising. It may also indicate that most people really only want to > use relative file names regardless of platform. Any other behaviour is potentially dangerous so must be optional. >> But even then - I still think silently stripping is the wrong thing to >> do. > > The contract of tar archives is they contain relative path names. GNU > tar strips leading slashes both when creating archives and when > extracting archives who's entry names contain leading slashes. Sticking > with that contract I still believe we should remove leading slashes by > default - and removing the drive letters is the Windows equivalent of it > IMHO. That's why I suggest to keep the drive letter in the > preserveLeadingSlashes == true case. I would expect the Windows default behaviour to generate/use relative names only. This means dropping both the drive letter *and* any leading \ or /. Optionally, tar should be able to preserve/use absolute path names. But the user must specify this. > Cheers > >         Stefan > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [hidden email] > For additional commands, e-mail: [hidden email] > --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 On 2018-01-21, sebb wrote: > On 21 January 2018 at 16:07, Stefan Bodewig <[hidden email]> wrote: >> The contract of tar archives is they contain relative path names. GNU >> tar strips leading slashes both when creating archives and when >> extracting archives who's entry names contain leading slashes. Sticking >> with that contract I still believe we should remove leading slashes by >> default - and removing the drive letters is the Windows equivalent of it >> IMHO. That's why I suggest to keep the drive letter in the >> preserveLeadingSlashes == true case. > I would expect the Windows default behaviour to generate/use relative > names only. > This means dropping both the drive letter *and* any leading \ or /. This is what happens right now. > Optionally, tar should be able to preserve/use absolute path names. > But the user must specify this. This is what I suggested. Change the meaning of "preserveLeadingSlashes" to "preserveAbsolutePath" and use it to keep the drive letter on Windows. The current code will always strip the drive letter spec. Stefan --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 On 22 January 2018 at 09:53, Stefan Bodewig <[hidden email]> wrote: > On 2018-01-21, sebb wrote: > >> On 21 January 2018 at 16:07, Stefan Bodewig <[hidden email]> wrote: > >>> The contract of tar archives is they contain relative path names. GNU >>> tar strips leading slashes both when creating archives and when >>> extracting archives who's entry names contain leading slashes. Sticking >>> with that contract I still believe we should remove leading slashes by >>> default - and removing the drive letters is the Windows equivalent of it >>> IMHO. That's why I suggest to keep the drive letter in the >>> preserveLeadingSlashes == true case. > >> I would expect the Windows default behaviour to generate/use relative >> names only. >> This means dropping both the drive letter *and* any leading \ or /. > > This is what happens right now. > >> Optionally, tar should be able to preserve/use absolute path names. >> But the user must specify this. > > This is what I suggested. Change the meaning of "preserveLeadingSlashes" > to "preserveAbsolutePath" and use it to keep the drive letter on > Windows. > > The current code will always strip the drive letter spec. In that case, I agree with you. Also as a name,  'preserveLeadingSlashes' only makes sense on a unix system, and it describes the processing rather than the functionality. > Stefan > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [hidden email] > For additional commands, e-mail: [hidden email] > --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 > > This is what I suggested. Change the meaning of "preserveLeadingSlashes" > > to "preserveAbsolutePath" and use it to keep the drive letter on > > Windows. > > > > The current code will always strip the drive letter spec. > > In that case, I agree with you. > > Also as a name,  'preserveLeadingSlashes' only makes sense on a unix > system, and it describes the processing rather than the functionality. > I guess that could be a reasonable compromise - and certainly much clearer from an API point of view. cheers, Torsten
Reply | Threaded
Open this post in threaded view
|

## Re: [compress] TarArchiveEntry and Windows drive letter

 On 2018-01-28, Torsten Curdt wrote: >>> This is what I suggested. Change the meaning of "preserveLeadingSlashes" >>> to "preserveAbsolutePath" and use it to keep the drive letter on >>> Windows. >>> The current code will always strip the drive letter spec. >> In that case, I agree with you. >> Also as a name,  'preserveLeadingSlashes' only makes sense on a unix >> system, and it describes the processing rather than the functionality. > I guess that could be a reasonable compromise - and certainly much clearer > from an API point of view. Done. Stefan --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]