[VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()


[VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

manco
Hi,
 
I am using VFS in an application running in its own JVM. It acts as a
transfer agent between a producer of files and a consumer. The producer
asynchronously puts files in a directory, and the consumer expects the files
to show up asynchronously in a prespecified destination directory.
 
The producer is local and the consumer is remote via SFTP. I use a regex
to list the source (producer) files:
 
private FileObject[] srcFiles;   // field of the enclosing class

public List getFiles() throws Exception
{
...
       // Build a selector from the configured regex and list the matching files.
       FileFilter fileFilter = (FileFilter) new VfsFileFilter(oprnd.getRegEx());
       FileFilterSelector selector = new FileFilterSelector(fileFilter);
       srcFiles = srcFileObj.findFiles(selector);
       list = Arrays.asList(srcFiles);
...
}

This all works fine and dandy!
 
I then iterate through the list and check whether each file exists prior to attempting the transfer.
 
 
methodName ()
{
...
            // It looks like external updates to the filesystem are not picked
            // up unless we clear the VFS cache. findFiles() with the regex
            // worked, which means the directory listing was updated,
            // but the exists() method would fail, so now we clear ...
            VFS.getManager().getFilesCache().close(); // needed to get things working ...
 
            files = srcTransporter.getFiles();   // method shown above
            if (files != null)
            {
               Iterator fileIter = files.iterator();
               while (fileIter.hasNext()) {
                  srcFo = (FileObject) fileIter.next();
                  if (srcFo.exists())              // <<<<< problem
                  {
                     // mv / cp SRC to DEST
                  }
               }
            }
...
}
 
Things work fine as long as my producer NEVER produces a file of the SAME name twice.
I call the above method from a loop: call the method, sleep(), and repeat ...
 
I test this by starting the transfer app and then manually moving files into the SRC directory.
Then I watch to see if they wind up in the DEST directory. As long as I don't put the same filename
back into the SRC directory, things work fine. However, if I put TestFileX.ext into the SRC directory again,
it shows up in the FileObject.findFiles() output list, which tells me somebody knows it's there,
but it FAILS the FileObject.exists() check! So somewhere there is a disconnect.
 
I was able to get around the problem by inserting the following line to clear the cache:
            VFS.getManager().getFilesCache().close(); // needed to get things working ...
 
I don't see why findFiles() sees it but exists() fails.
 
Manco
 


 

               

Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
manco wrote:
>
>I dont see why the findFiles() sees it but the exist() fails
>  
It has something to do with the cache, this is correct (and it drives me
crazy, but this will be something for VFS 2.0).
findFiles() finds the file on the filesystem but gets the cached object,
which might still be in the state "file deleted".

To refresh its cached information you can call close() on the
FileObject in question. This is the intended way to do this.

e.g.

for (FileObject fo : foundFiles)
{
    fo.close();
    if (fo.exists())
    {
       ......
    }
}

---
Mario


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Andy Lewis
Wouldn't it make sense to have findFiles() update cache entries? Or is
that the plan for 2.0?



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Andy Lewis wrote:
>Wouldn't it make sense to have findFiles() update cache entries? Or is
>that the plan for 2.0?
>  
I don't know exactly in which direction I will go with the caching stuff.

Currently I am thinking about refactoring VFS to be able to run without
caching at all, which is why it is planned for version 2.0.
There is a NullCache today, but it introduces some memory leaks.

Though, I will take your suggestion into account and see how this could
work; maybe this is something for 1.1.

Wouldn't it be nice to be able to write code like:

FileObject fo = vfs.resolveFile(....)
while (!fo.exists())
{
    wait
}

Sure, then exists() needs to hit the server every time, but today this
might not be much of a problem. And if the user needs caching, it might be
possible to wrap a FileObject in a CachedFileObject.
That could make the whole caching cleaner ....
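
The decorator idea above can be sketched in plain Java. This is only an illustration under stated assumptions: SimpleFile, DirectFile and CachedFile are hypothetical names, not part of the VFS API, and a BooleanSupplier stands in for the remote server.

```java
import java.util.function.BooleanSupplier;

// Hypothetical types for illustration; not the VFS API.
final class CachingDecoratorDemo {

    interface SimpleFile {
        boolean exists();
    }

    // Uncached: every call asks the backend (here a BooleanSupplier).
    static final class DirectFile implements SimpleFile {
        private final BooleanSupplier backend;
        int roundTrips = 0;

        DirectFile(BooleanSupplier backend) { this.backend = backend; }

        @Override
        public boolean exists() {
            roundTrips++;
            return backend.getAsBoolean();
        }
    }

    // Opt-in caching wrapper: remembers the answer until refresh().
    static final class CachedFile implements SimpleFile {
        private final SimpleFile delegate;
        private Boolean cached; // null means "nothing cached yet"

        CachedFile(SimpleFile delegate) { this.delegate = delegate; }

        @Override
        public boolean exists() {
            if (cached == null) {
                cached = delegate.exists();
            }
            return cached;
        }

        void refresh() { cached = null; }
    }

    public static void main(String[] args) {
        boolean[] present = { false };
        DirectFile direct = new DirectFile(() -> present[0]);

        // Polling the uncached object sees the file as soon as it appears.
        int polls = 0;
        while (!direct.exists()) {
            polls++;
            if (polls == 3) {
                present[0] = true; // the producer delivers the file
            }
        }
        System.out.println("polls before success: " + polls);

        // The wrapper trades freshness for fewer round trips.
        CachedFile cached = new CachedFile(direct);
        int before = direct.roundTrips;
        cached.exists();
        cached.exists();
        System.out.println("round trips via wrapper: " + (direct.roundTrips - before));
    }
}
```

With this split, the plain object is always correct but costs one round trip per call, while callers that can tolerate stale answers choose the wrapper explicitly.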

And from my point of view this is exactly what I would like to get from VFS.
There are some glitches in VFS which make this hard to fix. But hey,
we need some work for a 2.0 release.

Every comment is still welcome!
I need to see what users expect from VFS.

---
Mario



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

manco
OK, I was thinking the cache was some type of singleton per JVM, but it sounds from
your answer as if it is FileObject-specific. I didn't realize that I could close() a FileObject
and still use it, but the javadoc below says that I can.
 
VFS FileObject.close() javadoc:
"
Closes this file, and its content. This method is a hint to the implementation that it can release any resources associated with the file.

The file object can continue to be used after this method is called.
..."

thanks,
Manco


Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
manco wrote:
>ok,  I was thinking the cache was some type of Singleton per jvm,
It is a singleton per VFS Manager.
>but it sounds like from
>your answer that it is FileObject specific.
No, then I was not clear enough.

VFS has two "caches":
1) One which holds the FileObjects.
2) The FileObject itself, which caches its own state.

resolveFile/findFiles use the first to always return the same
instance of a FileObject to any thread.
This allows one to synchronize against this instance.

FileObject.close() resets the state of cache #2.
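
To make the two levels concrete, here is a toy model in plain Java. It is not VFS code: ToyManager, ToyFile and the in-memory "disk" set are hypothetical names invented for the illustration, and the real implementation differs in detail. It only reproduces the symptom from the start of the thread, where a file that reappears is listed again but still reports its old state until close() is called.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these classes exist in Commons VFS.
final class TwoCachesDemo {

    // Stands in for the real filesystem, which can change "externally".
    static final Set<String> disk = new HashSet<>();

    // Cache #2: each file object remembers its own state once attached.
    static final class ToyFile {
        private final String name;
        private Boolean cachedExists; // null means "not attached"

        ToyFile(String name) { this.name = name; }

        boolean exists() {
            if (cachedExists == null) {
                cachedExists = disk.contains(name); // one "round trip"
            }
            return cachedExists;
        }

        // Like the close() described above: forget the cached state.
        void close() { cachedExists = null; }
    }

    // Cache #1: the manager hands out the same instance per name.
    static final class ToyManager {
        private final Map<String, ToyFile> files = new HashMap<>();

        ToyFile resolveFile(String name) {
            return files.computeIfAbsent(name, ToyFile::new);
        }
    }

    public static void main(String[] args) {
        ToyManager manager = new ToyManager();

        disk.add("TestFileX.ext");
        ToyFile first = manager.resolveFile("TestFileX.ext");
        System.out.println(first.exists());   // true

        disk.remove("TestFileX.ext");         // the file was moved away
        first.close();
        System.out.println(first.exists());   // false, and now cached

        disk.add("TestFileX.ext");            // the producer puts it back
        ToyFile again = manager.resolveFile("TestFileX.ext");
        System.out.println(again == first);   // true: same instance (cache #1)
        System.out.println(again.exists());   // false: stale state (cache #2)

        again.close();                        // reset cache #2
        System.out.println(again.exists());   // true
    }
}
```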


Ciao,
Mario



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Andy Lewis

Taking a cursory look through the code, I have a couple of thoughts.

First, what about adding a stat() method on FileObject, something akin
to SftpFileObject.statSelf()? I know that is an API change, and
while this doesn't exist in many of the providers, and may not be
relevant for some, it is a pretty "normal" file system operation. Adding
a default empty implementation to AbstractFileObject would cover most
implementations. From a logical standpoint, VFS deals with remote
systems, so the need to periodically refresh the attributes and status of
a file should not be uncommon, and I can see many reasons for
wanting to refresh the state. Then, if any FileObject is resolved
from the cache and appears to be deleted, stat() could be called before
returning it to ensure the state is correct when the caller gets it.

As a simpler, more immediate "hack" fix, it looks like it might be
possible to fix the specific case of FileObject.findFiles(). Since
AbstractFileObject.findFiles() relies on getChildren(), which gets the
names and then resolves each file, it should be possible to assume that
any filename returned in the traverse() is not deleted, and to force a status
change immediately after the resolveFile() if it is marked deleted. This
would be an ideal time to call a FileObject.stat() routine if we had one.
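
A rough sketch of that "hack", again as a toy model rather than a patch: Entry, createExternally() and the in-memory directory are hypothetical stand-ins, and the only point is that a name coming from a fresh listing is used to clear a stale "deleted" mark on the cached object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model of the idea only; these are not VFS classes.
final class FindFilesRefreshDemo {

    static final class Entry {
        final String name;
        boolean markedDeleted; // state cached inside the object

        Entry(String name) { this.name = name; }

        boolean exists() { return !markedDeleted; }
    }

    private final TreeSet<String> directory = new TreeSet<>();
    private final Map<String, Entry> cache = new HashMap<>();

    // A change made behind the library's back, e.g. by the producer.
    void createExternally(String name) { directory.add(name); }

    // A delete through the library, which also marks the cached object.
    void delete(String name) {
        directory.remove(name);
        cache.computeIfAbsent(name, Entry::new).markedDeleted = true;
    }

    List<Entry> findFiles() {
        List<Entry> found = new ArrayList<>();
        for (String name : directory) {
            Entry entry = cache.computeIfAbsent(name, Entry::new);
            // The name came from a fresh listing, so a "deleted" mark is stale.
            if (entry.markedDeleted) {
                entry.markedDeleted = false;
            }
            found.add(entry);
        }
        return found;
    }

    public static void main(String[] args) {
        FindFilesRefreshDemo demo = new FindFilesRefreshDemo();
        demo.createExternally("TestFileX.ext");
        demo.findFiles();
        demo.delete("TestFileX.ext");           // moved to the destination
        demo.createExternally("TestFileX.ext"); // same name shows up again

        for (Entry e : demo.findFiles()) {
            System.out.println(e.name + " exists: " + e.exists());
        }
    }
}
```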

It might not solve every case, though. For example, if you resolve a
cached deleted file from a URL using FileSystemManager.resolveFile(),
how would that show up? (I haven't tested it.)

Anyhow, just some thoughts. I have spent relatively little time in the
VFS code, although I have used it quite a bit. It is an AWESOME bit of work in
my opinion, with HUGE potential for the future. I wish I had more time to
help with it....



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Andy Lewis wrote:
>First, what about adding a stat() method on FileObject, something akin
>  
I don't want to introduce a new public API function.

Especially since I think I would have to call stat() immediately after
resolveFile() every time, as I am not interested in performance but more
in correct results.
My current code is flooded with nasty close() calls (instead of stat()), as
this is the method which is intended to do the same as stat().

Sure, I know, it closes all streams (opened by the calling thread), but
this is not a problem in this context; in fact, this behavior is wanted.
This forces a clean state on the remote side.

I admit, what we can do is to force this during findFiles() ... will see.

>While it might not solve every case, for example, if you resolve a
>cached deleted file from a URL using FileSystemManager.resolveFile() -
>how would that show up? (haven't tested it)
>  
No server round trip here, so this file is still marked as "deleted".


I will have a look at refactoring FileObject to use no internal caching
and to allow decorating it with a CacheFileObject which behaves more or
less like what we see today.
And then we could have such a CacheFileObject.stat() method.

It should be configurable on FileSystemManager and maybe on FileSystem
level.
And the default might be to use the FileObject without caching.

---
Mario



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Philippe Poulard
hi,

I have just resolved a problem due to caching and deletion:
-delete()
-close() <- I found this tip in this thread :)

I understand that close() is needed to flush the cache, but I found a
problem in this sequence:
-delete()
-copy()
-set-attributes()

I catch an exception telling me that the file doesn't exist, but it does
exist: I have just created it! So, after creation, the cache seems
not to be updated.

The solution right now is:
-delete()
-close()
-copy()
-set-attributes()

but it is not fair



--
Cordialement,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------


Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Philippe Poulard wrote:
> -delete()
> -copy()
> -set-attributes()
>
> I catch an exception telling me that the file didn't exist ; but it
> does exist : i just have create it !!! so, after creation, the cache
> seems not be updated
Could you please pack these steps into runnable Java code, with some
asserts which show the problem?
I will have a look at it then.

Thanks!
Mario



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Philippe Poulard
Mario Ivankovits wrote:

> Philippe Poulard wrote:
>
>> -delete()
>> -copy()
>> -set-attributes()
>>
>> I catch an exception telling me that the file didn't exist ; but it
>> does exist : i just have create it !!! so, after creation, the cache
>> seems not be updated
>
> Could you please pack these steps in a runnable java code. With some
> asserts which show the problem.
> I will have a look at it then.
>
> Thanks!
> Mario

FileObject target = XFile.getXFile(
"xmldb:xyl://user:[hidden email]/path/to/file.xml" );
target.getContent().setAttribute( "xmldb-resource-type", "XMLResource" );
target.getContent().setAttribute( "cluster", "Raweb2004" );
target.delete();
FileObject src = XFile.getXFile( "file:///path/to/file.xml" );
target.copyFrom( src, Selectors.SELECT_ALL );

this last line throws an exception (see below)

Caused by: org.apache.commons.vfs.FileSystemException: Could not get
attributes "{1}" because it does not exist.

in the implementation of my XMLDBFileObject (returned with the scheme
"xmldb"), I have :

     protected OutputStream doGetOutputStream( boolean append )
             throws FileSystemException {
         // Buffer the content in memory and store it in the database on close().
         return new ByteArrayOutputStream( 1024 ) {
             public void close() throws IOException {
                 try {
                     super.close();
                     byte[] content = toByteArray();
                     Resource resource = getResource();
                     resource.setContent( content );
                     resource.getParentCollection().storeResource( resource );
                 } catch ( XMLDBException xdbe ) {
                     throw new FileSystemException( xdbe );
                 }
             }
         };
     }
     Resource getResource() throws FileSystemException, XMLDBException {
         boolean isXML = true;
         Collection coll = DatabaseManager.getCollection(
             XMLDBFileObject.this.getParent().getName().getURI()
         );
         for ( Iterator it = XMLDBFileObject.this.getContent()
                 .getAttributes().entrySet().iterator() ; it.hasNext() ; ) {
             /*
              * getAttributes() fails here because the type
              * is FileType.IMAGINARY : it has been set when
              * delete() has been called
              * all the stuff works if i end the doDelete()
              * with close()
              * If target.delete() is not called, the type
              * is not FileType.IMAGINARY and attributes
              * can be used and it works
              */
             // (loop body omitted in the original post)
         }
         Resource resource = coll.createResource(
             XMLDBFileObject.this.getName().getBaseName(),
             isXML ? XML_RESOURCE_TYPE : BINARY_RESOURCE_TYPE
         );
         return resource;
     }


Exception in thread "main" org.apache.commons.vfs.FileSystemException: Could not copy "file:///path/to/file.xml" to "xmldb:xyl://user:[hidden email]/path/to/file.xml".
        at org.inria.reflex.modules.io.XFile.copyFrom(XFile.java:884)
        at Test.main(Test.java:122)
Caused by: org.apache.commons.vfs.FileSystemException: Could not close the output stream for file "xmldb:xyl://user:[hidden email]/path/to/file.xml".
        at org.apache.commons.vfs.provider.DefaultFileContent$FileContentOutputStream.close(DefaultFileContent.java:504)
        at org.apache.commons.vfs.FileUtil.copyContent(FileUtil.java:106)
        at org.inria.reflex.modules.io.XFile.copyFrom(XFile.java:879)
        ... 1 more
Caused by: org.apache.commons.vfs.FileSystemException: Could not get attributes "{1}" because it does not exist.
        at org.apache.commons.vfs.provider.DefaultFileContent.getAttributes(DefaultFileContent.java:172)
        at org.inria.reflex.modules.io.xmldb.XMLDBFileObject.getResource(XMLDBFileObject.java:251)
        at org.inria.reflex.modules.io.xmldb.XMLDBFileObject$2.close(XMLDBFileObject.java:217)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
        at org.apache.commons.vfs.util.MonitorOutputStream.close(MonitorOutputStream.java:52)
        at org.apache.commons.vfs.provider.DefaultFileContent$FileContentOutputStream.close(DefaultFileContent.java:500)
        ... 3 more

In fact, the delete() method causes the problem because it sets the
type to FileType.IMAGINARY. We can't rely on the file type when copying
file content: since the target will have content, it is obvious
that its file type must be switched automatically from
FileType.IMAGINARY to FileType.FILE.

(I hope that FOLDERs are not intended to have content? In that case, there
should be something that denotes that hasAttributes() returns true.)

Maybe it should be corrected in FileUtil:

     /**
      * Copies the content from a source file to a destination file.
      */
     public static void copyContent(final FileObject srcFile,
                                    final FileObject destFile)
         throws IOException
     {
         // the destfile may be imaginary, so let's correct this: we are
         // sure it is a file now, because we are copying content into it
         destFile.setType( FileType.FILE );

         // Create the output stream via getContent(), to pick up the
         // validation it does
         final OutputStream outstr =
             destFile.getContent().getOutputStream();
         try
         {
             writeContent(srcFile, outstr);
         }
         finally
         {
             outstr.close();
         }
     }

--
Cordialement,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------


Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Philippe Poulard wrote:

>     Resource getResource() throws FileSystemException, XMLDBException {
>         boolean isXML = true;
>         Collection coll = DatabaseManager.getCollection(
>             XMLDBFileObject.this.getParent().getName().getURI()
>         );
>         for ( Iterator it =
> XMLDBFileObject.this.getContent().getAttributes().entrySet().iterator()
> ; it.hasNext() ; ) {
>             /*
>              * getAttributes() fails here because the type
>              * is FileType.IMAGINARY : it has been set when
>              * delete() has been called
So the problem is that imaginary files are not allowed to have attributes.
Why not simply skip the loop over the attributes if the file is of type
IMAGINARY?

I also do not fully understand why it works if you call close() after
delete(): during the attach() the file should still be IMAGINARY, as it
is deleted.
Might it be that your FileObject.doGetType() does not correctly report the
type of the file?


---
Mario



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Philippe Poulard
Mario Ivankovits wrote:

> Philippe Poulard wrote:
>
>>     Resource getResource() throws FileSystemException, XMLDBException {
>>         boolean isXML = true;
>>         Collection coll = DatabaseManager.getCollection(
>>             XMLDBFileObject.this.getParent().getName().getURI()
>>         );
>>         for ( Iterator it =
>> XMLDBFileObject.this.getContent().getAttributes().entrySet().iterator()
>> ; it.hasNext() ; ) {
>>             /*
>>              * getAttributes() fails here because the type
>>              * is FileType.IMAGINARY : it has been set when
>>              * delete() has been called
>
> So the problem is that imaginray files are not allowed to have attributes.
> Why not simply avoid the loop over the attributes if the file is of type
> IMAGINARY?

I need these attributes to create the file: XML:DB requires me to indicate
whether a file is XML or binary, and other attributes may also be required
depending on the provider. So, if IMAGINARY were a type that
accepted attributes, all would work fine.

Thus, when I perform a copy, I know that the target is going to be a
FILE, so I could use attributes; but if it has been previously deleted
and marked IMAGINARY, its attributes are kept but can't be used.

I don't know why IMAGINARY files can't have attributes. If the file
system says that attributes are supported, why block their usage
based on the type? It's not fair!

A workaround for me is to override the method that checks the type
and throws an exception, but I'd prefer a change to the IMAGINARY type.

>
> I also do not fully understandy why it works if you call close() after
> delete() - during the attach() the file should still be IMAGINARY as it
> is deleted.
> Might it be that your FileObject.doGetType() do not correctly report the
> type of the file?
>

Right: if it is not a FOLDER, it is reported as a FILE; that's why it works.
I will fix it...

--
Cordialement,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------


Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Philippe Poulard wrote:
> i need these attributes to create the file : XML:DB oblige to indicate
> if a file is XML or binary ; other attributes may also be required
> according to the provider ; so, if IMAGINARY was a type that would
> accept attributes, all would work fine
Hmmm ... I am not happy with this.

Sure, this is a small and easy change, but it breaks the VFS philosophy: a
single and consistent API to access files.
Why? You now have to know which filesystem you access and maybe set up
some (not defined by the fs) attributes to correctly create/read/write
files.

Say I would like to move from an FTP: store to XMLDB: I would like to
be able to do this without needing to change the whole application
(e.g. set attributes before file creation).


When exactly do you need those attributes?
If we need to distinguish between binary and text files, wouldn't it be better
to provide a configuration that maps filename extensions/content types to
the file type?
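
Such a mapping could look roughly like the sketch below. ResourceTypeMapping is a hypothetical helper, not an existing VFS class; "XMLResource" is the value used earlier in the thread, while "BinaryResource" as the fallback name is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; nothing like it is confirmed to exist in VFS.
final class ResourceTypeMapping {

    private final Map<String, String> byExtension = new HashMap<>();
    private final String defaultType;

    ResourceTypeMapping(String defaultType) {
        this.defaultType = defaultType;
    }

    // Registers a case-insensitive extension (without the dot).
    ResourceTypeMapping map(String extension, String type) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), type);
        return this;
    }

    // Falls back to the default for unknown or missing extensions.
    String typeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return defaultType;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.getOrDefault(ext, defaultType);
    }

    public static void main(String[] args) {
        ResourceTypeMapping mapping = new ResourceTypeMapping("BinaryResource")
                .map("xml", "XMLResource")
                .map("xsl", "XMLResource");

        System.out.println(mapping.typeFor("file.xml"));
        System.out.println(mapping.typeFor("logo.png"));
        System.out.println(mapping.typeFor("README"));
    }
}
```

The fallback is where the design gets contentious: an unknown extension is silently treated as binary, so the application loses the ability to say which type it meant.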

Which other attributes do you need?

---
Mario



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Philippe Poulard
Mario Ivankovits wrote:

> Philippe Poulard wrote:
>
>> i need these attributes to create the file : XML:DB oblige to indicate
>> if a file is XML or binary ; other attributes may also be required
>> according to the provider ; so, if IMAGINARY was a type that would
>> accept attributes, all would work fine
>
> Hmmm ... I am not happy with this.
>
> Sure, this is a small and easy change but it breaks VFS philosphy: A
> single and consistent api to access files.
> Why? - You now have to know which filesystem you access and maybe setup
> some (not defined by the fs) attributes to correctly create/read/write
> files.

An XML:DB "file system" is somewhat exotic. A URI is just a canonical
representation of a file as a single string, with
known fields: scheme, host, port, user, password, path, etc. The
question is how to deal with additional information that is required by the
scheme provider but can't be canonicalized. The solution is to
accept parameters.

>
> Say if I would like to move from a FTP: store to XMLDB: I would like to
> be able to do this without the need to change the whole application
> (e.g. set attributes before file creation)

These attributes are tightly coupled to this scheme,
so if you have "xmldb:provider://..." somewhere, you also have attribute
settings beside it.
Thus, when you make the switch from xmldb to, say, ftp, this includes
cleaning up the attributes that become irrelevant.

example:

with active tags i write it like this:
<io:file name="target"
file-name="xmldb:xyl://user:[hidden email]/path/to/file.xml">
     <!--needed by XML:DB-->
     <xcl:param name="xmldb-resource-type" value="XMLResource"/>
     <!--needed by Xyleme-->
     <xcl:param name="cluster" value="Raweb2004"/>
</io:file>

if i decide to move to ftp, i change like this:
<io:file name="target"
file-name="ftp://user:[hidden email]/path/to/file.xml"/>

>
>
> When exactly do you need those attributes?
> If we need to distinguish between binary/text files wouldnt it be better
> we provide a configuration to map filename-extensions/content-types to
> its type?

This implies that we must know which types we intend to use. With XML,
there are many common extensions, and if a new extension is
encountered, what do we do? We could attempt to parse the
file and, if that fails, assume that it's a binary file; but how can we
distinguish a broken XML file (that we shouldn't store) from a binary one
(that would also store broken XML)?
So we can't rely on the content, name, or content type to decide
which kind of file we are dealing with: this must be driven by the
application.

>
> Which attributes else do you need?
>

Many others, which depend on the XML:DB provider.
With Xyleme Zone Server, I have for example:
-the cluster name
-the attachment number
-the timeout
-the mode (pipeline or document)
etc.

Some could be "normalized" by VFS; for example, the "timeout" is a
concept shared by remote file systems.
Why not define an "org.apache.commons.VFS.attributes.timeout" attribute?
The same goes for other providers; this would avoid name conflicts.
--
Cordialement,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------


Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Philippe Poulard wrote:
> these attributes are tightly coupled to this scheme
> so if you have "xmldb:provider://..." somewhere, you also have
> attribute settings beside
> thus, when you make the switch from xmldb to -say- ftp, this includes
> to clean attributes becoming irrelevant
Is it possible to separate those attributes into
* filesystem attributes
* file attributes
?

I ask because e.g. the "cluster" sounds more like a per-filesystem-instance
configuration.
VFS currently supports such filesystem attributes, and one thing we can
do here is to register a virtual scheme for a set of filesystem attributes.
Say:
xmldbRaweb2004:xyl://user:[hidden email]/path/to/file.xml

would point to the scheme "xmldb" with the desired filesystem attributes
attached.
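
As a sketch of what such a registration could mean, the following maps a virtual scheme to a real scheme plus a fixed set of filesystem attributes. VirtualSchemeRegistry and its methods are hypothetical and do not exist in VFS, and "host" is a placeholder for the elided part of the URI.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; VFS has no class with this name.
final class VirtualSchemeRegistry {

    // Result of a lookup: the rewritten URI plus the attached attributes.
    static final class Resolved {
        final String uri;
        final Map<String, String> filesystemAttributes;

        Resolved(String uri, Map<String, String> filesystemAttributes) {
            this.uri = uri;
            this.filesystemAttributes = filesystemAttributes;
        }
    }

    private static final class Mapping {
        final String realScheme;
        final Map<String, String> attributes;

        Mapping(String realScheme, Map<String, String> attributes) {
            this.realScheme = realScheme;
            this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        }
    }

    private final Map<String, Mapping> mappings = new HashMap<>();

    void register(String virtualScheme, String realScheme, Map<String, String> attributes) {
        mappings.put(virtualScheme, new Mapping(realScheme, attributes));
    }

    // Rewrites "virtual:rest" to "real:rest"; unknown schemes pass through unchanged.
    Resolved resolve(String uri) {
        int colon = uri.indexOf(':');
        if (colon > 0) {
            Mapping m = mappings.get(uri.substring(0, colon));
            if (m != null) {
                return new Resolved(m.realScheme + uri.substring(colon), m.attributes);
            }
        }
        return new Resolved(uri, Collections.emptyMap());
    }

    public static void main(String[] args) {
        VirtualSchemeRegistry registry = new VirtualSchemeRegistry();
        Map<String, String> attrs = new HashMap<>();
        attrs.put("cluster", "Raweb2004");
        registry.register("xmldbRaweb2004", "xmldb", attrs);

        Resolved r = registry.resolve("xmldbRaweb2004:xyl://host/path/to/file.xml");
        System.out.println(r.uri);
        System.out.println(r.filesystemAttributes);
    }
}
```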


For the "file attributes" I will take some time to think about them.
What if we create a new method "FileObject.createFile(Map attributes)"?
That way there is no need to change the contract of the IMAGINARY file
type and it makes clear that if you would like to create a new file with
special attributes you have to call that method.

Ciao,
Mario




Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Philippe Poulard
Mario Ivankovits wrote:

> Philippe Poulard wrote:
>
>> these attributes are tightly coupled to this scheme
>> so if you have "xmldb:provider://..." somewhere, you also have
>> attribute settings beside
>> thus, when you make the switch from xmldb to -say- ftp, this includes
>> to clean attributes becoming irrelevant
>
> Is it possible to separate those attributes into
> * filesystem attributes
> * file attributes
> ?
>
> I ask as e.g. the "cluster" sounds more like a configuration per
> filesystem instance.

xmldbRaweb2004:xyl://user:[hidden email]/path/to/file.xml
^^^^^^^^^^^^^^
    this is not the xmldb scheme, as recommended by XML:DB

the same provider may act on several clusters, but in some
circumstances, it is not needed ; so we could have :
xmldb:xyl:Raweb2004//user:[hidden email]/path/to/file.xml
xmldb:xyl:Raweb2005//user:[hidden email]/path/to/file.xml
xmldb:xyl://user:[hidden email]/path/to/file.xml

it's a pain to add such a layer :(

> VFS currently supports such filesystem attributes and one thing we can
> do here is to register a virtual scheme to a set of filesystem attributes:
> say:
> xmldbRaweb2004:xyl://user:[hidden email]/path/to/file.xml
>
> would point to the scheme "xmldb" with the desired filesystem attributes
> attached.
>
>
> For the "file attributes" I will take some time to think about them.
> What if we create a new method "FileObject.createFile(Map attributes)"?
> That way there is no need to change the contract of the IMAGINARY file
> type and it makes clear that if you would like to create a new file with
> special attributes you have to call that method.

in this case, because of how specific all this is, it would be
more suitable to override the getAttributes() method (or another one; I will
look at the code) to perform the operation without checking the file type

--
Regards,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------



Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Mario Ivankovits
Philippe Poulard wrote:
> xmldbRaweb2004:xyl://user:[hidden email]/path/to/file.xml
> ^^^^^^^^^^^^^^
>    this is not the xmldb scheme, as recommended by XML:DB
I didn't mean to rename the provider, but to provide a mechanism where
one can tie a set of FileSystemOptions to a specific provider.
However, there is also another way to solve this. You can pass the
FileSystemOptions to the resolveFile method.

FileObject fo = FileSystemManager.resolveFile(String name,
FileSystemOptions fileSystemOptions)
Any subsequent call to "fo" (e.g. FileObject.resolveFile) will be able
to access these options.

>> What if we create a new method "FileObject.createFile(Map attributes)"?
>> That way there is no need to change the contract of the IMAGINARY
>> file type and it makes clear that if you would like to create a new
>> file with special attributes you have to call that method.
> in this case, because of how specific all this is, it would
> be more suitable to override the getAttributes() method (or another one; I
> will look at the code) to perform the operation without checking the file
> type
The best thing is to find a clean solution. The http filesystem might also
profit from it, as it might need to send the content-type of a file
to the server.

The problem with the solution to allow IMAGINARY files to have
attributes is the setAttribute method. It's passed down to
AbstractFileObject.doSetAttribute(final String atttrName, final Object
value)
which might immediately access the filesystem. But with IMAGINARY files
there is no file to which the filesystem can attach those attributes.

So why is it that bad to have a "FileObject.createFile/Folder(Map
attributes)"? Isn't it a clean entry point to create a File/Folder with
specific attributes?
Afterwards one can use setAttribute/getAttribute to modify them (if
possible)
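
A rough sketch of the contract discussed above, using a small in-memory stand-in rather than the real VFS classes. SketchFile and all of its methods are hypothetical; "createFile(Map attributes)" is only a proposal in this thread, not an existing VFS method.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a file object; not the VFS FileObject. */
final class SketchFile {
    // false models the IMAGINARY state: nothing exists on the filesystem yet.
    private boolean exists;
    private final Map<String, Object> attributes = new HashMap<>();

    boolean exists() {
        return exists;
    }

    /** Models the proposed createFile(Map): creation and initial attributes in one call. */
    void createFile(Map<String, Object> initialAttributes) {
        if (exists) {
            throw new IllegalStateException("file already exists");
        }
        attributes.putAll(initialAttributes);
        exists = true;
    }

    /** Attributes can only be attached once there is a file to attach them to. */
    void setAttribute(String name, Object value) {
        if (!exists) {
            throw new IllegalStateException("cannot set an attribute on an imaginary file");
        }
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}
```

In this model the imaginary file keeps rejecting setAttribute, so its contract is unchanged, while the attributes needed at creation time travel with the create call.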

It is also necessary to separate the attributes into FileSystemAttributes and
FileAttributes.
You might not see any difference when you use your active tags. But
there are differences if you use VFS in your code.

*) You have to pass the FileSystemAttributes only to the first resolveFile.
*) With every different set of FileSystemAttributes VFS will create a
new FileSystem. So if you access two clusters, VFS internally maintains
two filesystems.

xmldb:xyl://user:[hidden email]/path/to/file.xml with cluster=Raweb2004
is different to
xmldb:xyl://user:[hidden email]/path/to/file.xml with cluster=Raweb1998
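
The "one FileSystem per distinct set of FileSystemAttributes" behaviour can be pictured as a cache keyed by root URI plus options. This is only a simplified model, not how VFS implements it: SketchFileSystemKey, SketchFileSystemCache and the "xmldb:xyl://example.org" root are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical cache key: the same root with different options is a different key. */
final class SketchFileSystemKey {
    private final String rootUri;
    private final Map<String, String> options;

    SketchFileSystemKey(String rootUri, Map<String, String> options) {
        this.rootUri = rootUri;
        this.options = new HashMap<>(options); // defensive copy
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SketchFileSystemKey)) {
            return false;
        }
        SketchFileSystemKey other = (SketchFileSystemKey) o;
        return rootUri.equals(other.rootUri) && options.equals(other.options);
    }

    @Override
    public int hashCode() {
        return Objects.hash(rootUri, options);
    }
}

/** Hypothetical model of a manager that keeps one filesystem per key. */
final class SketchFileSystemCache {
    private final Map<SketchFileSystemKey, Object> fileSystems = new HashMap<>();

    // A plain Object stands in for a filesystem instance.
    Object getOrCreate(String rootUri, Map<String, String> options) {
        return fileSystems.computeIfAbsent(
            new SketchFileSystemKey(rootUri, options), key -> new Object());
    }

    int size() {
        return fileSystems.size();
    }
}
```

Asking this cache for the same root with cluster=Raweb2004 and then with cluster=Raweb1998 yields two separate instances, while repeating either request returns the instance created the first time.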


---
Mario




Re: [VFS] VFS.getManager().getFilesCache() - exists() vs findFiles()

Philippe Poulard
Mario Ivankovits wrote:

> Philippe Poulard wrote:
>
>> xmldbRaweb2004:xyl://user:[hidden email]/path/to/file.xml
>> ^^^^^^^^^^^^^^
>>    this is not the xmldb scheme, as recommended by XML:DB
>
> I didn't mean to rename the provider, but to provide a mechanism where
> one can tie a set of FileSystemOptions to a specific provider.
> However, there is also another way to solve this. You can pass the
> FileSystemOptions to the resolveFile method.
>
> FileObject fo = FileSystemManager.resolveFile(String name,
> FileSystemOptions fileSystemOptions)
> Any subsequent call to "fo" (e.g. FileObject.resolveFile) will be able
> to access these options.
>
>>> What if we create a new method "FileObject.createFile(Map attributes)"?
>>> That way there is no need to change the contract of the IMAGINARY
>>> file type and it makes clear that if you would like to create a new
>>> file with special attributes you have to call that method.
>>
>> in this case, because of how specific all this is, it would
>> be more suitable to override the getAttributes() method (or another one; I
>> will look at the code) to perform the operation without checking the file
>> type
>
> The best thing is to find a clean solution. The http filesystem might also
> profit from it, as it might need to send the content-type of a file
> to the server.
>
> The problem with the solution to allow IMAGINARY files to have
> attributes is the setAttribute method. It's passed down to
> AbstractFileObject.doSetAttribute(final String atttrName, final Object
> value)
> which might immediately access the filesystem. But with IMAGINARY files
> there is no file to which the filesystem can attach those attributes.
>
> So why is it that bad to have a "FileObject.createFile/Folder(Map
> attributes)"? Isn't it a clean entry point to create a File/Folder with
> specific attributes?
> Afterwards one can use setAttribute/getAttribute to modify them (if
> possible)

ok, that's clean

thanks

>
> It is also necessary to separate the attributes into FileSystemAttributes and
> FileAttributes.
> You might not see any difference when you use your active tags. But
> there are differences if you use VFS in your code.
>
> *) You have to pass the FileSystemAttributes only to the first resolveFile.
> *) With every different set of FileSystemAttributes VFS will create a
> new FileSystem. So if you access two clusters, VFS internally maintains
> two filesystems.
>
> xmldb:xyl://user:[hidden email]/path/to/file.xml with cluster=Raweb2004
> is different to
> xmldb:xyl://user:[hidden email]/path/to/file.xml with cluster=Raweb1998
>

this solution doesn't suit Xyleme, because its notion of cluster is
somewhat unusual: you need it when you store a file, but you may omit
it when you retrieve it!

--
Regards,

            ///
           (. .)
  -----ooO--(_)--Ooo-----
|   Philippe Poulard    |
  -----------------------
