no name listed for file contained in a 7z file?

Albretch Mueller-3
 I downloaded this file using wget:

 http://dumps.wikimedia.your.org/other/static_html_dumps/2008-06/en/wikipedia-en-html.tar.7z

 and it seems to be fine:

$ _IFL="wikipedia-en-html.tar.7z"

$ ls -l "${_IFL}"
-rw-r--r-- 1 niggahme niggahme 15363543213 Jun 21  2008 wikipedia-en-html.tar.7z

$ file "${_IFL}"
wikipedia-en-html.tar.7z: 7-zip archive data, version 0.2

$ md5sum -b "${_IFL}"
03ce695cbf32a3f8636fa8d3f9c7d12e *wikipedia-en-html.tar.7z

$ sha256sum -b "${_IFL}"
c2794b6371a05017f03e2a345730fd763b1052872290b5c78763978a0b43c747 *wikipedia-en-html.tar.7z

$ sha512sum -b "${_IFL}"
d52a737ceca25ef18272ba70a4a56000a7a0bff92653fb462674333a0855f397c892b8aeb2e11206d391ba4cca48d46f5814d92db4d2096467519de38c5a189c *wikipedia-en-html.tar.7z

$ 7z l "${_IFL}"

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Pentium(R) CPU B940 @ 2.00GHz (206A7),ASM)

Scanning the drive for archives:
1 file, 15363543213 bytes (15 GiB)

Listing archive: wikipedia-en-html.tar.7z

--
Path = wikipedia-en-html.tar.7z
Type = 7z
Physical Size = 15363543213
Headers Size = 100
Method = LZMA:22
Solid = -
Blocks = 1

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2008-06-18 13:02:15 ..... 223674511360  15363543113  wikipedia-en-html.tar
------------------- ----- ------------ ------------  ------------------------
2008-06-18 13:02:15       223674511360  15363543113  1 files
$

 But when I try to use apache.compress I can't get the name of the
compressed/contained file, even though ark and 7z show it to you. Here
is my simple piece of code:

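 // Needs commons-compress on the classpath.
 import java.io.File;
 import java.io.IOException;
 import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
 import org.apache.commons.compress.archivers.sevenz.SevenZFile;

 String aIFl = "wikipedia-en-html.tar.7z";
 File I7ZKFl = new File(aIFl);
 if(I7ZKFl.exists()){
  // try-with-resources closes the archive even if reading fails
  try(SevenZFile SvnZFl = new SevenZFile(I7ZKFl)){
   SevenZArchiveEntry entry;
   int iIx = 0;
   // getNextEntry() returns null once all entries have been listed
   while((entry = SvnZFl.getNextEntry()) != null){
    System.out.println("// __ [" + iIx + "]: |" + entry + "|");

    System.out.println("// __ .getName() |" + entry.getName() + "|");
    System.out.println("// __ .getSize() |" + entry.getSize() + "|");
    System.out.println("// __ .getLastModifiedDate() |" + entry.getLastModifiedDate() + "|");

    ++iIx;
   }// ((entry = SvnZFl.getNextEntry()) != null)
  }catch(IOException IOX){ IOX.printStackTrace(System.err); }
 }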

 which produced the following output, correct except for the name:

// __ [0]: |org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry@179d3b25|
// __ .getName() |null|
// __ .getSize() |223674511360|
// __ .getLastModifiedDate() |Wed Jun 18 14:02:15 EDT 2008|

 Why is it that I can't get the file name?

 Also, if OO works, I should be able to access and process this file
while addressing it like (using an exclamation mark):

 wikipedia-en-html.tar.7z!wikipedia-en-html.tar

 So, at this point I should be able to go:

 String aIFl = "wikipedia-en-html.tar.7z!wikipedia-en-html.tar";
 FileInputStream FISTarK = new FileInputStream(new File(aIFl));
 TarArchiveInputStream tarInput = new TarArchiveInputStream(FISTarK);
 TarArchiveEntry tArKEnt;
 while((tArKEnt=tarInput.getNextTarEntry()) != null){
  ...
 }

 right?

 lbrtchx

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: no name listed for file contained in a 7z file?

Albretch Mueller-3
 I CAN'T get the name of the compressed/contained file ...
 lbrtchx

Re: no name listed for file contained in a 7z file?

Stefan Bodewig
In reply to this post by Albretch Mueller-3
On 2019-01-21, Albretch Mueller wrote:

>  I downloaded this file using wget:
>
>  http://dumps.wikimedia.your.org/other/static_html_dumps/2008-06/en/wikipedia-en-html.tar.7z
>
>  and it seems to be fine:

...

> $ 7z l "${_IFL}"
>

...

>    Date      Time    Attr         Size   Compressed  Name
> ------------------- ----- ------------ ------------  ------------------------
> 2008-06-18 13:02:15 ..... 223674511360  15363543113  wikipedia-en-html.tar
> ------------------- ----- ------------ ------------  ------------------------
> 2008-06-18 13:02:15       223674511360  15363543113  1 files

>
>  But when I try to use apache.compress I can get the name of the
> compressed/contained file even though ark and 7z show it to you.

I get the same results. It looks as if the archive was doing something
unusual.

Watching Commons Compress while it reads the archive shows the archive
does not contain any file names inside of the "FilesInfo" part which is
why "null" is returned as name. There must be a place where the file
name is stored that is not expected by our code and I'll need time, a
hex editor and maybe a newer version of the format specification in
order to understand what is going on. It is not really helping that the
only known file exhibiting the problem is 15 GB in size :-)

Could you please open a JIRA issue if you want to keep track of any
progress we might make?

...

> Also, if OO works, I should be able to access and process this file
> while addressing it like (using an exclamation mark):
>
>  wikipedia-en-html.tar.7z!wikipedia-en-html.tar
>
>  So, I this point I should be able to go:
>
>  String aIFl = "wikipedia-en-html.tar.7z!wikipedia-en-html.tar"
>  FileInputStream FISTarK = new FileInputStream(new File(aIFl));

This would require the File class to know the exclamation mark syntax,
which it doesn't - and FileInputStream would need to know how to detect
this is an entry inside of a 7z archive and whom to ask when it needs to
extract the entry. This is not how the java.io package works.
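
To illustrate the point about java.io, here is a minimal sketch using only the JDK. The class name BangPathSketch and its split helper are made up for this example; they are not part of Java or of Commons Compress. It shows that File keeps the '!' as an ordinary character in the file name, so any "archive!entry" convention has to be parsed by the caller before anything is opened.

```java
import java.io.File;

final class BangPathSketch {
    /**
     * Hypothetical helper: splits "archive!entry" at the first '!'.
     * Returns {archive, entry}, or {path, null} when there is no '!'.
     */
    static String[] split(String path) {
        int bang = path.indexOf('!');
        if (bang < 0) {
            return new String[] { path, null };
        }
        return new String[] { path.substring(0, bang), path.substring(bang + 1) };
    }

    public static void main(String[] args) {
        String spec = "wikipedia-en-html.tar.7z!wikipedia-en-html.tar";

        // java.io.File sees one ordinary file name, '!' included.
        System.out.println(new File(spec).getName());

        // The caller has to separate archive and entry itself.
        String[] parts = split(spec);
        System.out.println("archive: " + parts[0]);
        System.out.println("entry:   " + parts[1]);
    }
}
```

After such a split, the archive part would still have to be opened with an archive-aware API and the entry looked up inside it; File and FileInputStream will not do that step.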

Java's URLs would be closer, but most likely you'd need something like
Commons VFS or java.nio.file.FileSystem which have proper abstractions
for a higher level API than the one provided by Compress. Unfortunately,
I don't think either of them supports 7z, though.
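
For a feel of what such an abstraction looks like, here is a small sketch with the zip file system provider that ships with the JDK. It uses zip, not 7z, precisely because the JDK has no 7z provider; the class name ZipFileSystemSketch, the temporary file names and the entry name are assumptions made up for the example. The archive is addressed with a "jar:" URI and the entry is then read through the ordinary java.nio.file.Files calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

final class ZipFileSystemSketch {
    /** Writes one entry into a new zip, reopens the zip and returns the entry's text. */
    static String roundTrip(String entryName, String text) {
        try {
            Path dir = Files.createTempDirectory("zipfs-sketch");
            Path zip = dir.resolve("demo.zip");
            // The JDK's zip provider is selected by the "jar:" scheme.
            URI uri = URI.create("jar:" + zip.toUri());
            try {
                // "create" asks the provider to create the archive if it is missing.
                try (FileSystem fs = FileSystems.newFileSystem(
                        uri, Collections.singletonMap("create", "true"))) {
                    Files.write(fs.getPath(entryName), text.getBytes(StandardCharsets.UTF_8));
                }
                // Reopen the archive and read the entry like a normal file.
                try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.emptyMap())) {
                    byte[] bytes = Files.readAllBytes(fs.getPath(entryName));
                    return new String(bytes, StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(zip);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello.txt", "read through a FileSystem"));
    }
}
```

Doing the same for a .7z archive would need a file system provider for that format, which is exactly the missing piece.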

Stefan

Re: no name listed for file contained in a 7z file?

Albretch Mueller-3
On 1/23/19, Stefan Bodewig <[hidden email]> wrote:
> It is not really helping that the
> only known file exhibiting the problem is 15 GB in size :-)

 There are plenty of those here:

 http://dumps.wikimedia.your.org/other/static_html_dumps/2008-06/

> Could you please open a JIRA issue if you want to keep track of any
> progress we might make?

 I fell out of love with coding some time ago. I have been noticing
that things kept going on. All I read there were words. How do you
open a JIRA issue? Could you at least provide me with a link? I tried
searching my way through it but teachers don't have that much time

 lbrtchx

Re: no name listed for file contained in a 7z file?

Matt Sicker
On Mon, 28 Jan 2019 at 18:55, Albretch Mueller <[hidden email]> wrote:
>  I fell out of love with coding some time ago. I have been noticing
> that things kept going on. All I read there were words. How do you
> open a JIRA issue? Could you at least provide me with a link? I tried
> searching my way through it but teachers don't have that much time

Jira: https://issues.apache.org/jira/browse/COMPRESS

--
Matt Sicker <[hidden email]>

Re: no name listed for file contained in a 7z file?

Albretch Mueller-3
On 1/29/19, Matt Sicker <[hidden email]> wrote:
> Jira: https://issues.apache.org/jira/browse/COMPRESS

 https://issues.apache.org/jira/browse/COMPRESS-478

 lbrtchx

Re: no name listed for file contained in a 7z file?

Stefan Bodewig
In reply to this post by Stefan Bodewig
On 2019-01-23, Stefan Bodewig wrote:

> On 2019-01-21, Albretch Mueller wrote:

>>  I downloaded this file using wget:

>>  http://dumps.wikimedia.your.org/other/static_html_dumps/2008-06/en/wikipedia-en-html.tar.7z

>>  and it seems to be fine:

> ...

>> $ 7z l "${_IFL}"


> ...

>>    Date      Time    Attr         Size   Compressed  Name
>> ------------------- ----- ------------ ------------  ------------------------
>> 2008-06-18 13:02:15 ..... 223674511360  15363543113  wikipedia-en-html.tar
>> ------------------- ----- ------------ ------------  ------------------------
>> 2008-06-18 13:02:15       223674511360  15363543113  1 files


>>  But when I try to use apache.compress I can get the name of the
>> compressed/contained file even though ark and 7z show it to you.

> I get the same results. It looks as if the archive was doing something
> unusual.

I'm really sorry it took me seven months to find enough time to dive in
deep enough.

> Watching Commons Compress while it reads the archive shows the archive
> does not contain any file names inside of the "FilesInfo" part which is
> why "null" is returned as name.

This turned out to be true. The archive simply doesn't contain any name
for the entry at all. The 7z command line tool (as well as other "UI"
parts of the 7zip tools) contains special logic for unnamed entries
and derives a default name from the name of the archive itself:

https://github.com/kornelski/7z/blob/master/CPP/7zip/UI/Common/DefaultName.cpp

Compress 1.19 (not yet released) will contain a getDefaultName method
that implements the same logic and an option to use the default name
for entries with a null name. It will not be on by default, as a decision
like this really should be made at a layer on top of Compress, much
like it is made at the UI layer inside of the 7zip tools.
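
As a rough illustration of that kind of rule, and not the actual Commons Compress or 7-Zip code, the sketch below assumes the default name is the archive's file name with its last extension stripped, or the file name with "~" appended when there is no extension. The class DefaultNameSketch, both method names and the useDefault flag are made up for the example; the flag only mirrors the idea that the fallback is opt-in.

```java
import java.io.File;

final class DefaultNameSketch {
    /** Derives a fallback entry name from the archive's file name (assumed rule). */
    static String defaultName(String archivePath) {
        String last = new File(archivePath).getName();
        int dot = last.lastIndexOf('.');
        // dot == 0 means a leading dot (hidden file), not an extension.
        return dot > 0 ? last.substring(0, dot) : last + "~";
    }

    /** Opt-in fallback: substitutes the default only when the caller asks for it. */
    static String effectiveName(String entryName, String archivePath, boolean useDefault) {
        if (entryName != null || !useDefault) {
            return entryName;
        }
        return defaultName(archivePath);
    }

    public static void main(String[] args) {
        String archive = "wikipedia-en-html.tar.7z";
        // Fallback off: the missing name stays null.
        System.out.println(effectiveName(null, archive, false));
        // Fallback on: the name is derived from the archive's own name.
        System.out.println(effectiveName(null, archive, true));
    }
}
```

With the fallback switched on, the unnamed entry from this thread would be reported as wikipedia-en-html.tar, which matches what 7z l prints.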

Stefan
