New Sub-project Proposal.

classic Classic list List threaded Threaded
46 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: New Sub-project Proposal.

Claude Warren
> At first sight, I'd say that serialization is out-of-scope (we
> should let application developers deal with that using the
> available accessors).

How does one serialize a bloom filter if to do so you need to implement the
private Object writeReplace() method?  A list of proto bloom filters is not
serializable, actually bloom filters themselves should be serializable.
How does one accomplish that if serialization is left to later?

Claude

On Mon, Sep 23, 2019 at 8:00 PM Gilles Sadowski <[hidden email]>
wrote:

> Hi.
>
> Le lun. 23 sept. 2019 à 12:59, Claude Warren <[hidden email]> a écrit :
> >
> > I will rework to remove the package private and other access issues
> noted.
> >
> > In Builder there is a difference between with() and build().  This
> follows
> > the pattern established by MessageDigest[1] where it is possible to
> build a
> > digest in one call or by adding multiple items and then calling digest.
> > The differences here are digest has an update() method the builder uses
> > with(), digest uses a digest() method to return the result builder uses
> > build().
>
> I noted the difference but still think it better to leave out the all
> the "build" methods except the argument-less "build()".  IMHO,
> saving the one call to the latter is not worth the duplication.
>
> > The serializable classes are classes that in the indexing code are sent
> > across the wire or stored.
>
> At first sight, I'd say that serialization is out-of-scope (we
> should let application developers deal with that using the
> available accessors).
>
> > I suppose there are other ways to do this and
> > the serialization could be removed.  Not a direction I would like to take
> > but is doable.
> >
> > HammingWeight and Log values re used in indexing.  I think HammingWeight
> is
> > used in the collections as well.  HammingWeight is a well known property
> of
> > a bloom filter.
>
> Why is it not computed unconditionally at instantiation?
>
> >  The Log not so much so.  I suppose the log could be
> > removed and created as a separate library method that would execute
> against
> > BitSets.  Moving the calculation outside of the Bloom filter means that
> we
> > will have to make copies of bloom filters to perform the log and the log
> > will not be retained with the filter when used in indexing.
>
> In both cases, a method could return (but not store) the values; up
> to the application developers to manage them according to their
> needs (?).
>
> >
> > Murmer 128 hash is not in Commons Codec (at least I could not find it).
> > Perhaps it should be added?
>
> It seems so.  Better ask in another post (with [Codec] as the "Subject:"
> prefix).
>
> >  I could extract the code and provide it there
> > if that is the appropriate solution.
> >
> > ProtoBloomFilter is not an abbreviation of PrototypeBloomFilter as it is
> > not a Bloom filter but a primitive object from which Bloom filters may be
> > created.  I think this is a proper use of the Proto[2] prefix.
> >
> > Package name plural.  I thought there was some disagreement on this,
> > however I will change it.
> >
> > I will revert the FindBugs config change as it is now commented out
> > anyway.  How does one suppress the fall through checkstyle error.  I
> could
> > find no annotations that would suppress the error and did not want to add
> > another dependency to the build.  I did find that other components in the
> > project are using the checkstyle filter and so added the supression of
> the
> > FallThrough check.
>
> I think that the comment
> ---CUT---
> // fallthru
> ---CUT---
> should do the trick.
>
> >
> > The reference to what is a Bloom filter is in
> > bloomfilters/package-info.java.
>
> OK, then.  Sorry for the false positive. ;-)
>
> >  Does there need to be a reference from the
> > BloomFilter interface definition to that?
> >
> > Thank you for taking the time to review and comment.
>
> Thank you for your contribution,
> Gilles
>
> >
> > Claude
> >
> > [1]
> >
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/MessageDigest.html
> > [2] https://www.dictionary.com/browse/proto?s=t
> >
> > On Mon, Sep 23, 2019 at 11:13 AM Claude Warren <[hidden email]> wrote:
> >
> > > For the style issues.... is there an Eclipse style package that meets
> the
> > > commons style or some other tool that will correctly configure the
> format
> > > and style options in Eclipse?
> > >
> > > On Mon, Sep 23, 2019 at 10:54 AM Gilles Sadowski <[hidden email]
> >
> > > wrote:
> > >
> > >> Hello.
> > >>
> > >> Here are a few comment from a quick browse of today's update
> > >> of PR #83.
> > >>
> > >> * "package private for testing" is not a good reason (IMO)
> > >> * There are spurious blank spaces in some of the file ("git diff"
> > >> shows them in red)
> > >> * You should always perform "git rebase master"
> > >> * In "Builder":
> > >> ** Field "hashes" should be "final"
> > >> ** I still fail to see the necessity to duplicate "with" functionality
> > >> with "build" methods
> > >> * Why should so many classes be "Serializable"?
> > >> * "StandardBloomFilter" is documented as "immutable" but it is not:
> > >> There are 2 non "final" fields and 1 "final" but "protected" (hence
> > >> it can be modified outside the class without ensuring its invariants)
> > >> * IMHO use of transient fields is detrimental to the robustness of the
> > >> design: It seems that "hamming weight" is not _used_ by the
> > >> implementations (called only in the unit tests?)
> > >> * Ditto for "logValue"
> > >> * Did you check whether "murmur 128 hash" is available in "Commons
> > >> Codec"?
> > >> * Name of classes should not be abbreviated (cf. "Config", "Proto",
> > >> "Stats" ...)
> > >> * Package name is still plural: Should be singular IMO
> > >> * Rule checks general suppression should be removed in the config
> > >> files (CheckStyle, FindBugs) and replace by specific (within code)
> > >> statements
> > >> if applicable (e.g. the disabling of "no default in switch" is
> unnecessary
> > >> since AFAICT a "default" is always provided).
> > >> * All (also "private") fields and methods must be commented with
> Javadoc,
> > >> not just code comment.
> > >> * There should be a reference to what is a Bloom filter
> > >> * Occurrences of "bloom" should be replaced with "Bloom"
> > >> * There some style issues:
> > >> ** There should be no space after an opening parenthesis, or before
> > >> a closing one
> > >> ** Curly braces position: opening on the same line, closing on a
> > >> separate line
> > >>
> > >> Best regards,
> > >> Gilles
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: [hidden email]
> > >> For additional commands, e-mail: [hidden email]
> > >>
> > >>
> > >
> > > --
> > > I like: Like Like - The likeliest place on the web
> > > <http://like-like.xenei.com>
> > > LinkedIn: http://www.linkedin.com/in/claudewarren
> > >
> >
> >
> > --
> > I like: Like Like - The likeliest place on the web
> > <http://like-like.xenei.com>
> > LinkedIn: http://www.linkedin.com/in/claudewarren
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
Reply | Threaded
Open this post in threaded view
|

Re: New Sub-project Proposal.

sebb-2-2
On Mon, 23 Sep 2019 at 20:18, Claude Warren <[hidden email]> wrote:
>
> > At first sight, I'd say that serialization is out-of-scope (we
> > should let application developers deal with that using the
> > available accessors).
>
> How does one serialize a bloom filter if to do so you need to implement the
> private Object writeReplace() method?  A list of proto bloom filters is not
> serializable, actually bloom filters themselves should be serializable.
> How does one accomplish that if serialization is left to later?

Application code must be able to construct bloom filters.
To do this presumably requires some data items, so all the app has to
do is serialise the data required to construct the filter.

> Claude
>
> On Mon, Sep 23, 2019 at 8:00 PM Gilles Sadowski <[hidden email]>
> wrote:
>
> > Hi.
> >
> > Le lun. 23 sept. 2019 à 12:59, Claude Warren <[hidden email]> a écrit :
> > >
> > > I will rework to remove the package private and other access issues
> > noted.
> > >
> > > In Builder there is a difference between with() and build().  This
> > follows
> > > the pattern established by MessageDigest[1] where it is possible to
> > build a
> > > digest in one call or by adding multiple items and then calling digest.
> > > The differences here are digest has an update() method the builder uses
> > > with(), digest uses a digest() method to return the result builder uses
> > > build().
> >
> > I noted the difference but still think it better to leave out the all
> > the "build" methods except the argument-less "build()".  IMHO,
> > saving the one call to the latter is not worth the duplication.
> >
> > > The serializable classes are classes that in the indexing code are sent
> > > across the wire or stored.
> >
> > At first sight, I'd say that serialization is out-of-scope (we
> > should let application developers deal with that using the
> > available accessors).
> >
> > > I suppose there are other ways to do this and
> > > the serialization could be removed.  Not a direction I would like to take
> > > but is doable.
> > >
> > > HammingWeight and Log values re used in indexing.  I think HammingWeight
> > is
> > > used in the collections as well.  HammingWeight is a well known property
> > of
> > > a bloom filter.
> >
> > Why is it not computed unconditionally at instantiation?
> >
> > >  The Log not so much so.  I suppose the log could be
> > > removed and created as a separate library method that would execute
> > against
> > > BitSets.  Moving the calculation outside of the Bloom filter means that
> > we
> > > will have to make copies of bloom filters to perform the log and the log
> > > will not be retained with the filter when used in indexing.
> >
> > In both cases, a method could return (but not store) the values; up
> > to the application developers to manage them according to their
> > needs (?).
> >
> > >
> > > Murmer 128 hash is not in Commons Codec (at least I could not find it).
> > > Perhaps it should be added?
> >
> > It seems so.  Better ask in another post (with [Codec] as the "Subject:"
> > prefix).
> >
> > >  I could extract the code and provide it there
> > > if that is the appropriate solution.
> > >
> > > ProtoBloomFilter is not an abbreviation of PrototypeBloomFilter as it is
> > > not a Bloom filter but a primitive object from which Bloom filters may be
> > > created.  I think this is a proper use of the Proto[2] prefix.
> > >
> > > Package name plural.  I thought there was some disagreement on this,
> > > however I will change it.
> > >
> > > I will revert the FindBugs config change as it is now commented out
> > > anyway.  How does one suppress the fall through checkstyle error.  I
> > could
> > > find no annotations that would suppress the error and did not want to add
> > > another dependency to the build.  I did find that other components in the
> > > project are using the checkstyle filter and so added the supression of
> > the
> > > FallThrough check.
> >
> > I think that the comment
> > ---CUT---
> > // fallthru
> > ---CUT---
> > should do the trick.
> >
> > >
> > > The reference to what is a Bloom filter is in
> > > bloomfilters/package-info.java.
> >
> > OK, then.  Sorry for the false positive. ;-)
> >
> > >  Does there need to be a reference from the
> > > BloomFilter interface definition to that?
> > >
> > > Thank you for taking the time to review and comment.
> >
> > Thank you for your contribution,
> > Gilles
> >
> > >
> > > Claude
> > >
> > > [1]
> > >
> > https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/MessageDigest.html
> > > [2] https://www.dictionary.com/browse/proto?s=t
> > >
> > > On Mon, Sep 23, 2019 at 11:13 AM Claude Warren <[hidden email]> wrote:
> > >
> > > > For the style issues.... is there an Eclipse style package that meets
> > the
> > > > commons style or some other tool that will correctly configure the
> > format
> > > > and style options in Eclipse?
> > > >
> > > > On Mon, Sep 23, 2019 at 10:54 AM Gilles Sadowski <[hidden email]
> > >
> > > > wrote:
> > > >
> > > >> Hello.
> > > >>
> > > >> Here are a few comment from a quick browse of today's update
> > > >> of PR #83.
> > > >>
> > > >> * "package private for testing" is not a good reason (IMO)
> > > >> * There are spurious blank spaces in some of the file ("git diff"
> > > >> shows them in red)
> > > >> * You should always perform "git rebase master"
> > > >> * In "Builder":
> > > >> ** Field "hashes" should be "final"
> > > >> ** I still fail to see the necessity to duplicate "with" functionality
> > > >> with "build" methods
> > > >> * Why should so many classes be "Serializable"?
> > > >> * "StandardBloomFilter" is documented as "immutable" but it is not:
> > > >> There are 2 non "final" fields and 1 "final" but "protected" (hence
> > > >> it can be modified outside the class without ensuring its invariants)
> > > >> * IMHO use of transient fields is detrimental to the robustness of the
> > > >> design: It seems that "hamming weight" is not _used_ by the
> > > >> implementations (called only in the unit tests?)
> > > >> * Ditto for "logValue"
> > > >> * Did you check whether "murmur 128 hash" is available in "Commons
> > > >> Codec"?
> > > >> * Name of classes should not be abbreviated (cf. "Config", "Proto",
> > > >> "Stats" ...)
> > > >> * Package name is still plural: Should be singular IMO
> > > >> * Rule checks general suppression should be removed in the config
> > > >> files (CheckStyle, FindBugs) and replace by specific (within code)
> > > >> statements
> > > >> if applicable (e.g. the disabling of "no default in switch" is
> > unnecessary
> > > >> since AFAICT a "default" is always provided).
> > > >> * All (also "private") fields and methods must be commented with
> > Javadoc,
> > > >> not just code comment.
> > > >> * There should be a reference to what is a Bloom filter
> > > >> * Occurrences of "bloom" should be replaced with "Bloom"
> > > >> * There some style issues:
> > > >> ** There should be no space after an opening parenthesis, or before
> > > >> a closing one
> > > >> ** Curly braces position: opening on the same line, closing on a
> > > >> separate line
> > > >>
> > > >> Best regards,
> > > >> Gilles
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: [hidden email]
> > > >> For additional commands, e-mail: [hidden email]
> > > >>
> > > >>
> > > >
> > > > --
> > > > I like: Like Like - The likeliest place on the web
> > > > <http://like-like.xenei.com>
> > > > LinkedIn: http://www.linkedin.com/in/claudewarren
> > > >
> > >
> > >
> > > --
> > > I like: Like Like - The likeliest place on the web
> > > <http://like-like.xenei.com>
> > > LinkedIn: http://www.linkedin.com/in/claudewarren
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [hidden email]
> > For additional commands, e-mail: [hidden email]
> >
> >
>
> --
> I like: Like Like - The likeliest place on the web
> <http://like-like.xenei.com>
> LinkedIn: http://www.linkedin.com/in/claudewarren

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: New Sub-project Proposal.

Gilles Sadowski-2
Hi.

Is there a JIRA report associated with the proposal?

It would help review if there were several PRs that
differentiates between "core" functionality ("BloomFilter"),
with minimal API (vs "syntactic sugar"), and higher-level
utilities ("FilterConfiguration", "GatedCollection", etc.).

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: New Sub-project Proposal.

Claude Warren
There is no associated JIRA report.  Not even sure how to generate or build
one.

Claude

On Wed, Sep 25, 2019 at 5:42 PM Gilles Sadowski <[hidden email]>
wrote:

> Hi.
>
> Is there a JIRA report associated with the proposal?
>
> It would help review if there were several PRs that
> differentiates between "core" functionality ("BloomFilter"),
> with minimal API (vs "syntactic sugar"), and higher-level
> utilities ("FilterConfiguration", "GatedCollection", etc.).
>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
Reply | Threaded
Open this post in threaded view
|

Re: New Sub-project Proposal.

Gilles Sadowski-2
Hello.

2019-09-25 19:13 UTC+02:00, Claude Warren <[hidden email]>:
> There is no associated JIRA report.  Not even sure how to generate or build
> one.

Issue-tracking system (for "Commons Collections") is there:
   https://issues.apache.org/jira/projects/COLLECTIONS

Gilles

> Claude
>
> On Wed, Sep 25, 2019 at 5:42 PM Gilles Sadowski <[hidden email]>
> wrote:
>
>> Hi.
>>
>> Is there a JIRA report associated with the proposal?
>>
>> It would help review if there were several PRs that
>> differentiates between "core" functionality ("BloomFilter"),
>> with minimal API (vs "syntactic sugar"), and higher-level
>> utilities ("FilterConfiguration", "GatedCollection", etc.).
>>
>> Regards,
>> Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: New Sub-project Proposal.

Claude Warren
Created  COLLECTIONS-728 (
https://issues.apache.org/jira/browse/COLLECTIONS-728).  I hope that was
appropriate.

On Wed, Sep 25, 2019 at 6:41 PM Gilles Sadowski <[hidden email]>
wrote:

> Hello.
>
> 2019-09-25 19:13 UTC+02:00, Claude Warren <[hidden email]>:
> > There is no associated JIRA report.  Not even sure how to generate or
> build
> > one.
>
> Issue-tracking system (for "Commons Collections") is there:
>    https://issues.apache.org/jira/projects/COLLECTIONS
>
> Gilles
>
> > Claude
> >
> > On Wed, Sep 25, 2019 at 5:42 PM Gilles Sadowski <[hidden email]>
> > wrote:
> >
> >> Hi.
> >>
> >> Is there a JIRA report associated with the proposal?
> >>
> >> It would help review if there were several PRs that
> >> differentiates between "core" functionality ("BloomFilter"),
> >> with minimal API (vs "syntactic sugar"), and higher-level
> >> utilities ("FilterConfiguration", "GatedCollection", etc.).
> >>
> >> Regards,
> >> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
123