[RDF] jena, rdf4j, json-ld integrations

[RDF] jena, rdf4j, json-ld integrations

Stian Soiland-Reyes
Hi,

[[ Cross-posting - let's try to reply to dev@commons. ]]


As you might have spotted if you've seen the Commons RDF Jira/commit
emails, I have been working on adding integrations for Jena, RDF4j and
JSONLD-Java on these branches:


https://github.com/apache/incubator-commonsrdf/tree/jena
https://github.com/apache/incubator-commonsrdf/tree/rdf4j
https://github.com/apache/incubator-commonsrdf/tree/jsonld-java


I think they are now nearing completion, so I suggest we merge them
to master and aim for a 0.3.0-incubating release of Commons RDF.


Each includes a full RDFTermFactory, along with Graph, Triple,
Quad, Dataset and of course the various RDFTerms.

The jena branch also includes support for generalized triples and
generalized quads, implementing the freshly added TripleLike/QuadLike.




See merged javadoc here:

http://stain.github.io/incubator-commonsrdf/integration/

e.g. under "All known implementing classes" at

http://stain.github.io/incubator-commonsrdf/integration/org/apache/commons/rdf/api/RDFTermFactory.html

http://stain.github.io/incubator-commonsrdf/integration/org/apache/commons/rdf/api/Triple.html



Each implementation (except simple) works by wrapping its native classes, e.g.

https://github.com/apache/incubator-commonsrdf/blob/rdf4j/rdf4j/src/main/java/org/apache/commons/rdf/rdf4j/impl/TripleImpl.java

.. which wraps an org.eclipse.rdf4j.model.Statement and constructs the
RDFTerms of its subject/predicate/object on demand.
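
A minimal sketch of that wrapping idea, not the actual TripleImpl code.
NativeStatement, WrappedIri and WrappedTriple are hypothetical stand-ins
(assumptions for illustration) for the native statement class and the
Commons RDF types; the point is only that the wrapper keeps the native
object and builds term wrappers when a getter is called:

```java
import java.util.Objects;

// Hypothetical stand-in for a native statement class such as
// org.eclipse.rdf4j.model.Statement; this is not the real RDF4J type.
final class NativeStatement {
    final String subject;
    final String predicate;
    final String object;

    NativeStatement(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical stand-in for a Commons RDF term wrapping an IRI string.
final class WrappedIri {
    private final String iri;

    WrappedIri(String iri) {
        this.iri = Objects.requireNonNull(iri);
    }

    String ntriplesString() {
        return "<" + iri + ">";
    }
}

// Holds only the native statement; term wrappers are built when requested.
final class WrappedTriple {
    private final NativeStatement statement;

    WrappedTriple(NativeStatement statement) {
        this.statement = Objects.requireNonNull(statement);
    }

    WrappedIri getSubject() { return new WrappedIri(statement.subject); }

    WrappedIri getPredicate() { return new WrappedIri(statement.predicate); }

    WrappedIri getObject() { return new WrappedIri(statement.object); }

    // Gives callers the untouched native object back.
    NativeStatement asNative() { return statement; }
}
```

The trade-off in such a design is a small allocation per getter call in
exchange for never copying or converting the native statement up front.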




I've also merged them all in a single branch to test if they are interoperable:

https://github.com/apache/incubator-commonsrdf/tree/jena-jsonld-rdf4j-integration/integration-tests

The tests create RDFTerms/triples in one RDFTermFactory, then add them
to a graph created with a 'foreign' RDFTermFactory, and then this is
tested all-to-all between all factories, including retrieving back
those previously difficult BlankNodes.
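
To illustrate what such a cross-factory check exercises, here is a rough
sketch under stated assumptions. Node, NodeA, NodeB and GraphB are
hypothetical names, not the Commons RDF API or the real integration
tests; the assumption is that blank nodes compare equal across
implementations purely by a unique reference string:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical cut-down blank node contract: equality is decided by
// uniqueReference() alone, whichever implementation created the node.
interface Node {
    String uniqueReference();
}

// Blank node as created by a hypothetical "factory A".
final class NodeA implements Node {
    private final String ref;

    NodeA(String salt, String label) {
        this.ref = salt + ":" + label;
    }

    @Override public String uniqueReference() { return ref; }

    @Override public boolean equals(Object other) {
        return other instanceof Node
                && ref.equals(((Node) other).uniqueReference());
    }

    @Override public int hashCode() { return ref.hashCode(); }
}

// Blank node of a hypothetical "factory B", able to adopt foreign nodes.
final class NodeB implements Node {
    private final String ref;

    private NodeB(String ref) { this.ref = ref; }

    // Reuses a NodeB as-is; otherwise copies the foreign unique reference.
    static NodeB adopt(Node foreign) {
        return foreign instanceof NodeB
                ? (NodeB) foreign
                : new NodeB(foreign.uniqueReference());
    }

    @Override public String uniqueReference() { return ref; }

    @Override public boolean equals(Object other) {
        return other instanceof Node
                && ref.equals(((Node) other).uniqueReference());
    }

    @Override public int hashCode() { return ref.hashCode(); }
}

// Graph owned by factory B; converts whatever it is given on the way in.
final class GraphB {
    private final Set<NodeB> nodes = new HashSet<>();

    void add(Node node) { nodes.add(NodeB.adopt(node)); }

    boolean contains(Node node) { return nodes.contains(NodeB.adopt(node)); }
}
```

A node made by "factory A" can then be added to GraphB and found again
with the same NodeA instance, which is the round trip the all-to-all
tests are described as checking.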



Good news, everyone! They all talk to each other!


And what surprised me most was that these three worked well enough
without any classpath issues from shared dependencies - I thought
it would be sensitive to, say, the JSON-LD Java or HttpClient version.
I guess more testing (in particular of parsing) would find out.






I also used sed to make a search-and-replace variant of the rdf4j
module for the older Sesame 4, as the big differences are just package
names: imports come from org.openrdf instead of org.eclipse.rdf4j.

https://github.com/stain/incubator-commonsrdf/tree/sesame4

(I didn't commit this to Apache as I'm not sure we would want to
support such a 'clone' of the rdf4j module, since it would require
double maintenance. In theory, though, it could be used for
interoperability between sesame4 and rdf4j! :-) )
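
The package swap itself was done with sed; purely as an illustration,
the same prefix replacement could be sketched in Java as below.
PackageRename and toSesame are hypothetical names, and the sketch
assumes the only change needed is the org.eclipse.rdf4j to org.openrdf
prefix, so any other differences would still need hand edits:

```java
// Hypothetical helper; the actual branch was produced with sed.
final class PackageRename {
    private PackageRename() {
    }

    // Literal (non-regex) replacement of every RDF4J package prefix
    // in the given Java source text with the older Sesame prefix.
    static String toSesame(String javaSource) {
        return javaSource.replace("org.eclipse.rdf4j.", "org.openrdf.");
    }
}
```

Applied to a line such as "import org.eclipse.rdf4j.model.Statement;"
this yields "import org.openrdf.model.Statement;".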



Now is the time to discuss some strategy points!

I'll ask about those in separate emails, and I'll copy
[hidden email] as I think we can have good feedback there also
from non-RDF folks - e.g. as we need to agree on common styles etc.

(And we need to practice moving our Commons RDF email traffic to dev@commons)


--
Stian Soiland-Reyes
http://orcid.org/0000-0001-9842-9718

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: [RDF] jena, rdf4j, json-ld integrations

Peter Ansell
Hi Stian,

Sesame-4 will not have any more releases due to the Eclipse migration,
so you will not have a large user-base for that. Even maintaining a
Sesame-2.8 module may not find many users, as users who are still
using it for the near future will likely not be migrating to Java-8
and hence won't be integrating Commons RDF as often as those who are
already migrating to Eclipse RDF4J.

Developing an integration for JSONLD-Java in the same way as I have
set up the others, with their own repositories, is a possibility for
that aspect, similar to:

https://github.com/jsonld-java/jsonld-java-clerezza

https://github.com/jsonld-java/jsonld-java-rdf2go

However, both Jena and Sesame/RDF4J have ended up having their
JSONLD-Java integrations moved back into their core repositories so it
isn't the same for everyone.

Cheers,

Peter

On 13 September 2016 at 09:55, Stian Soiland-Reyes <[hidden email]> wrote:

> [...]


Re: [RDF] jena, rdf4j, json-ld integrations

Stian Soiland-Reyes
On 13 Sep 2016 5:14 a.m., "Peter Ansell" <[hidden email]> wrote:
> Sesame-4 will not have any more releases due to the Eclipse migration,
> so you will not have a large user-base for that. Even maintaining a
> Sesame-2.8 module may not find many users, as users who are still
> using it for the near future will likely not be migrating to Java-8
> and hence won't be integrating Commons RDF as often as those who are
> already migrating to Eclipse RDF4J.

Right, I will probably just release the sesame4 module as a private
one-off on GitHub. I think it would only be useful to a narrow set of
users who are "temporarily stuck" with Sesame (as I am in one project I
don't want to maintain too much) but want to wriggle their way over to
RDF4J via Commons RDF, say one class/module at a time.

Thankfully rdf4j and sesame live well on the class path together! (But I
guess that could break if RDF4J moves on to much newer dependencies.)

> Developing an integration for JSONLD-Java in the same way as I have
> setup the others with their own repositories is a possibility for that
> aspect, similar to:
>
> https://github.com/jsonld-java/jsonld-java-clerezza
> https://github.com/jsonld-java/jsonld-java-rdf2go

Yes, that could be a possibility, but I think Commons is still not quite
ready for "micro" git repositories as it would complicate web site
maintenance and release processes. It's easy enough to split out the
modules with git subtree, but not if you keep developing on both sides
afterwards. :)

> However, both Jena and Sesame/RDF4J have ended up having their
> JSONLD-Java integrations moved back into their core repositories so it
> isn't the same for everyone.

Yes, I think that would be a good goal, so perhaps the JSONLD-Java model
would work here as well. Then it does not matter so much "how integrated"
Commons RDF gets where - Commons could still provide an empty "glue" Maven
artifact to depend on such external integrations (or just list them on the
site).

BTW, as I'm currently the sole author of the commonsrdf-rdf4j integration I
could contribute it separately to Eclipse under the EPL, if so desired.

Re: [RDF] jena, rdf4j, json-ld integrations

Peter Ansell
On 15 September 2016 at 12:49, Stian Soiland-Reyes <[hidden email]> wrote:

> On 13 Sep 2016 5:14 a.m., "Peter Ansell" <[hidden email]> wrote:
>> Sesame-4 will not have any more releases due to the Eclipse migration,
>> so you will not have a large user-base for that. Even maintaining a
>> Sesame-2.8 module may not find many users, as users who are still
>> using it for the near future will likely not be migrating to Java-8
>> and hence won't be integrating Commons RDF as often as those who are
>> already migrating to Eclipse RDF4J.
>
> Right, I will probably just release the sesame4 as a private one-off on
> GitHub, I think it would only be useful to a narrow set of users who are
> "temporarily stuck" with sesame (as I am in one project I don't want to
> maintain too much) but want to wriggle their way over to RDF4j via Commons
> RDF, say one class/module at the time.

One of the original goals was to help with migration and
interoperability so if it doesn't then things would need to be
reworked on the Commons RDF side to support that.

> Thankfully rdf4j and sesame live well on the class path together! (But I
> guess that could break if RDF 4J moves on to way newer dependencies)

The main dependencies that are shared and liable to break are the
FasterXML Jackson and Apache HttpClient dependencies that both
semi-regularly break their public APIs at the minor version level and
sometimes at the patch level. In the long term you would need to
isolate RDF4J and Sesame with OSGi or Java-9 modules/etc. to keep them
playing nice together.

>> Developing an integration for JSONLD-Java in the same way as I have
>> setup the others with their own repositories is a possibility for that
>> aspect, similar to:
>>
>> https://github.com/jsonld-java/jsonld-java-clerezza
>> https://github.com/jsonld-java/jsonld-java-rdf2go
>
> Yes, that could be a possibility, but I think Commons is still not quite
> ready for "micro" git repositories as it would complicate web site
> maintenance and release processes. it's easy enough to split out the
> modules with git subtree, but not if you keep developing on both sides
> afterwards. :)

Keeping them together in a single Git repository simplifies things a
lot when you are making regular changes to things. Git
subtrees/submodules are a nightmare in my experience and hopefully you
don't have to go down that track.

>> However, both Jena and Sesame/RDF4J have ended up having their
>> JSONLD-Java integrations moved back into their core repositories so it
>> isn't the same for everyone.
>
> Yes, I think that would be good goal, so perhaps the JSONLD-Java model
> would work here as well, then it does not matter so much "how integrated"
> Commons RDF gets where - Commons could still provide an empty "glue" Maven
> artifact to depend on such external integrations (or just list them on the
> site)

It has worked fairly well for JSONLD-Java so far, which has stabilised
most of its API for now.

> BTW, as I'm currently the sole author of the commonsrdf-rdf4j integration I
> could contribute it separately to Eclipse under the EPL, if so desired.

RDF4J didn't choose to use the EPL, they are using a BSD-style license
that Eclipse also support, but the rest of the Eclipse legal
procedures for contributions are still being used.

https://github.com/eclipse/rdf4j/blob/master/edl-v1.0.txt

Cheers,

Peter


Re: [RDF] jena, rdf4j, json-ld integrations

Stian Soiland-Reyes
On 15 Sep 2016 5:11 a.m., "Peter Ansell" <[hidden email]> wrote:
> One of the original goals was to help with migration and
> interoperability so if it doesn't then things would need to be
> reworked on the Commons RDF side to support that.

I would hope it does that now :-)

> The main dependencies that are shared and liable to break are the
> FasterXML Jackson and Apache HttpClient dependencies that both
> semi-regularly break their public APIs at the minor version level and
> sometimes at the patch level. In the long term you would need to
> isolate RDF4J and Sesame with OSGi or Java-9 modules/etc. to keep them
> playing nice together.

Yes, both of these have broken things for me as well. It's weird that
semantic versioning is still not followed clearly by such popular
libraries.

All Commons RDF modules are OSGi bundles, but this would also need to be
tested more. Do you have any recommendations for what is a good framework
for such integration tests? I think in jena-osgi we used Felix with a Maven
plugin to help run junit tests within OSGi.

> It has worked fairly well for JSONLD-Java so far, which has stabilised
> most of its API for now.

BTW, would you be OK to have a look at the JSONLD-Java module and see if I
am keeping it within the API boundaries? The RDFDataset in jsonld-java
subclasses HashMap, and I had to use that for a couple of the calls (e.g.
deleting quads), which I feel is intruding on implementation details.

(Also I found no methods for adding an existing Quad object, is that on
purpose or a missing feature?)
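
A small sketch of why going through the HashMap side feels like
intruding on implementation details. MapBackedDataset and DatasetClient
are hypothetical and only mimic the shape described above (a dataset
class that extends HashMap and offers no removal method); they are not
the jsonld-java classes:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical dataset that exposes its storage by extending HashMap:
// graph name -> list of quads (plain strings here for brevity).
final class MapBackedDataset extends HashMap<String, List<String>> {
    private static final long serialVersionUID = 1L;

    void addQuad(String graph, String quad) {
        computeIfAbsent(graph, g -> new ArrayList<>()).add(quad);
    }
}

final class DatasetClient {
    private DatasetClient() {
    }

    // With no removal method on offer, the caller has to reach into the
    // inherited Map view, which ties it to the internal storage layout.
    static boolean removeQuad(MapBackedDataset dataset, String graph, String quad) {
        List<String> quads = dataset.get(graph);
        return quads != null && quads.remove(quad);
    }
}
```

If the storage layout ever changed (say, to a different value type per
graph), client code like removeQuad would break even though no
documented dataset method had changed.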

> RDF4J didn't choose to use the EPL, they are using a BSD-style license
> that Eclipse also support, but the rest of the Eclipse legal
> procedures for contributions are still being used

Brilliant, that should mean we could in theory include RDF4J code/jars
under the Apache license (with appropriate notices).

Re: [RDF] jena, rdf4j, json-ld integrations

Peter Ansell
On 15 September 2016 at 18:34, Stian Soiland-Reyes <[hidden email]> wrote:

> On 15 Sep 2016 5:11 a.m., "Peter Ansell" <[hidden email]> wrote:
>> One of the original goals was to help with migration and
>> interoperability so if it doesn't then things would need to be
>> reworked on the Commons RDF side to support that.
>
> I would hope it does that now :-)
>
>> The main dependencies that are shared and liable to break are the
>> FasterXML Jackson and Apache HttpClient dependencies that both
>> semi-regularly break their public APIs at the minor version level and
>> sometimes at the patch level. In the long term you would need to
>> isolate RDF4J and Sesame with OSGi or Java-9 modules/etc. to keep them
>> playing nice together.
>
> Yes, both of these have broken things for me as well. It's weird semantic
> versioning is still not followed clearly for such popular libraries.

Jackson are a bit complex with their policy. They haven't defined
their public API at this point so anything could be removed and even
public API methods that have been deprecated for two minor versions
can be removed.

https://github.com/FasterXML/jackson/wiki/Jackson-Releases

Apache HttpComponents appear to refer in their release procedures to
verifying compatibility with the "baseline release". That seems to
mean minor version as the example has xx.yy.00, implying you go back
to the first patch version for a minor version and check that you are
compatible with it. However, I have still seen cases in the past for
the 4.x series where patch versions within a minor version are
incompatible so there are still issues with it in practice.

https://wiki.apache.org/HttpComponents/HttpComponentsReleaseProcess

> All Commons RDF modules are OSGi bundles, but this would also need to be
> tested more. Do you have any recommendations for what is a good framework
> for such integration tests? I think in jena-osgi we used Felix with a Maven
> plugin to help run junit tests within OSGi.

I don't use OSGi myself. I have been running around recently
submitting pull requests to get upstream Sesame libraries to support
RDF4J now that it is out, so I can upgrade my toolchain to RDF4J.

>> It has worked fairly well for JSONLD-Java so far, which has stabilised
>> most of its API for now.
>
> BTW, would you be OK to have a look at the JSONLD-Java module and see if I
> am keeping it within the API boundaries? The RDFDataset in jsonld-java
> subclasses HashMap, and I had to use that for a couple of the calls (e.g.
> deleting quads), which I feel is intruding on implementation details.

I will have a look at it.

> (Also I found no methods for adding an existing Quad object, is that on
> purpose or a missing feature?)

I will look into that as RDFDataset should support quads, but may not
have a method for insertion right now.

>> RDF4J didn't choose to use the EPL, they are using a BSD-style license
>> that Eclipse also support, but the rest of the Eclipse legal
>> procedures for contributions are still being used
>
> Brilliant, that should mean we could in theory include RDF4J code/jars
> under the Apache license (with appropriate notices).

You would need to email the rdf4j-dev list or one of the other Eclipse
mailing lists to get more advice about legal aspects.

Cheers,

Peter
