[DBCP] Connection pool not behaving as I expect

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

[DBCP] Connection pool not behaving as I expect

Shawn Heisey
We have been having some problems lately where our MySQL server hits the
max connection limit (600) and then everything breaks.  When I look into
the problem, I find that our application servers have each made nearly a
hundred connections to the DB and haven't closed any of them for hours.

I'm also using connection pooling in my programs, with the latest DBCP
version.  Those servers don't open nearly as many connections, and have
idle eviction to keep the connection count down.  But when the limit is
reached, these programs suddenly stop working too.

Investigating these problems, I manage to get connected and kill off the
surplus of idle connections, and everything starts working.

Today, a couple of days after the last incident, I realized that we
should *NOT* be having these problems -- because we're using connection
pooling.  The application has open and idle connections to the DB server
... so why is trying to open MORE connections (and obviously failing)
instead of using one of the perfectly good connections that's already
sitting there, unused?

I'm writing here specifically for DBCP on my programs, so I know you
guys probably can't help with Tomcat's connection pooling ... but for
either case my question stands:  Why isn't connection pooling doing its job?

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [DBCP] Connection pool not behaving as I expect

garydgregory
On Thu, Mar 1, 2018 at 3:33 PM, Shawn Heisey <[hidden email]> wrote:

> We have been having some problems lately where our MySQL server hits the
> max connection limit (600) and then everything breaks.  When I look into
> the problem, I find that our application servers have each made nearly a
> hundred connections to the DB and haven't closed any of them for hours.
>
> I'm also using connection pooling in my programs, with the latest DBCP
> version.  Those servers don't open nearly as many connections, and have
> idle eviction to keep the connection count down.  But when the limit is
> reached, these programs suddenly stop working too.
>
> Investigating these problems, I manage to get connected and kill off the
> surplus of idle connections, and everything starts working.
>
> Today, a couple of days after the last incident, I realized that we
> should *NOT* be having these problems -- because we're using connection
> pooling.  The application has open and idle connections to the DB server
> ... so why is trying to open MORE connections (and obviously failing)
> instead of using one of the perfectly good connections that's already
> sitting there, unused?
>
> I'm writing here specifically for DBCP on my programs, so I know you
> guys probably can't help with Tomcat's connection pooling ... but for
> either case my question stands:  Why isn't connection pooling doing its
> job?
>

I do not think this is a question I, or anyone here, can answer
generically. I can read between that lines that you must feel frustrated
and I certainly empathize with that. I think you might want to debug your
application and come up with some parameters for us to start helping you. A
reproducible example is always best but I understand it might be hard to
provide in this particular case.

Gary


> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DBCP] Connection pool not behaving as I expect

Phil Steitz
In reply to this post by Shawn Heisey


> On Mar 1, 2018, at 3:33 PM, Shawn Heisey <[hidden email]> wrote:
>
> We have been having some problems lately where our MySQL server hits the
> max connection limit (600) and then everything breaks.  When I look into
> the problem, I find that our application servers have each made nearly a
> hundred connections to the DB and haven't closed any of them for hours.
>
> I'm also using connection pooling in my programs, with the latest DBCP
> version.  Those servers don't open nearly as many connections, and have
> idle eviction to keep the connection count down.  But when the limit is
> reached, these programs suddenly stop working too.
>
> Investigating these problems, I manage to get connected and kill off the
> surplus of idle connections, and everything starts working.
>
> Today, a couple of days after the last incident, I realized that we
> should *NOT* be having these problems -- because we're using connection
> pooling.  The application has open and idle connections to the DB server
> ... so why is trying to open MORE connections (and obviously failing)
> instead of using one of the perfectly good connections that's already
> sitting there, unused?
>
> I'm writing here specifically for DBCP on my programs, so I know you
> guys probably can't help with Tomcat's connection pooling ... but for
> either case my question stands:  Why isn't connection pooling doing its job?

Best to start by posting your pool config.  One thing to bear in mind is that “idle” from The dB perspective does not mean the same thing as idle from the pool’s perspective.  In addition to genuinely abandoned connections, another way that applications can hog dB connections is to check them out and hold them for a long time while not using them.  On the dB engine side these will show as idle, but as they are checked out to clients they are not available in the pool.  When you are having the problem, what does dB one report as numidle?

Phil
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [DBCP] Connection pool not behaving as I expect

Shawn Heisey
In reply to this post by garydgregory
On 3/1/2018 4:46 PM, Gary Gregory wrote:
> I do not think this is a question I, or anyone here, can answer
> generically. I can read between that lines that you must feel frustrated
> and I certainly empathize with that. I think you might want to debug your
> application and come up with some parameters for us to start helping you. A
> reproducible example is always best but I understand it might be hard to
> provide in this particular case.

There is a lot of frustration.  Until today all of it was directed at
our developers, for creating programs and configs that make way too many
connections to the DB.

But then today, I had that small eureka moment, thinking "wait a minute
... how can this even be happening at all, if the connection pool has
connections that the DB server says are active and idle?"

Reiterating something I said before: I know you can't help me with the
pools that the Tomcat servers are creating for our webapps.  So I'll
limit the rest of the discussion to my own program, which uses DBCP, and
has the same problems.

Please tell me what information you'd like me to provide. Anything that
is in my power, I will get it to you.

This is how I set up DBCP in my code:

   /*
    * Create a datasource (connection pool) for the master database server.
    */
   ConnectionFactory cfMaster = new
DriverManagerConnectionFactory(masterUrl, dbUser, dbPass);
   PoolableConnectionFactory pcfMaster = new
PoolableConnectionFactory(cfMaster, null);
   pcfMaster.setValidationQuery(validationQuery);
   pcfMaster.setValidationQueryTimeout(Const.FIVE_SECONDS / 1000);
   opMaster = new GenericObjectPool<>(pcfMaster);
   opMaster.setMaxWaitMillis(Const.THIRTY_SECONDS);
   opMaster.setMaxIdle(numShards);
   opMaster.setMaxTotal(numShards * 5);
   opMaster.setNumTestsPerEvictionRun(numShards * 5);
opMaster.setTimeBetweenEvictionRunsMillis(Const.FIVE_SECONDS);
   opMaster.setMinEvictableIdleTimeMillis(Const.ONE_MINUTE * 5);
   opMaster.setTestOnCreate(true);
   opMaster.setTestOnBorrow(true);
   opMaster.setTestOnReturn(true);
   opMaster.setTestWhileIdle(true);
   pcfMaster.setPool(opMaster);
   dsMaster = new PoolingDataSource<>(opMaster);

The JDBC driver we use is MySQL.  As of a few weeks ago, it was the
newest stable version available, 5.1.something.  Also at that time, I
was using the latest DBCP and POOL versions.  If any new versions have
come out very recently, I probably don't have them yet.

Typically the numShards value we're using is 6, to help with
understanding the code above.

Observations: When the MySQL server has reached its connection limit, at
least one of the idle connections is from this program using DBCP.  But
when the program attempts to use the DB, it gets the "Too many
connections" error response -- which means that it must be opening a
brand new connection, despite the fact that there SHOULD be at least one
that is ready and sitting in the pool.

The code that uses the DB is basic JDBC code.  It calls getConnection()
on the dataSource, verifies that the connection is valid, creates a
statement, executes it, and if it was a query, processes the resultset. 
Then it closes any resultset, closes the statement, and closes the
connection.  As I understand it, that close should return the connection
to the pool, still open, and ready for re-use.  This all happens within
a single thread.  I went through this code pretty closely for another
issue on this mailing list.  It's possible that I missed something, but
it looks very clean.

I was going to add some debug logging to my code, but I can't see any
way with PoolingDataSource to get the number of active and idle
connections, just to make SURE that the pool really has what I think it
does.

I have a code change ready to switch everything to BasicDataSource and
add the debug logging.  It's generally less verbose code, and looks to
be just as configurable as PoolingDataSource.  Would that change be a
good idea?

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [DBCP] Connection pool not behaving as I expect

Matt Sicker
Take a look inside commons-pool for the instrumentation (e.g., JMX). You
can also track usage on borrow and other leaks.

Also, Tomcat uses DBCP as it is.

On 1 March 2018 at 21:46, Shawn Heisey <[hidden email]> wrote:

> On 3/1/2018 4:46 PM, Gary Gregory wrote:
>
>> I do not think this is a question I, or anyone here, can answer
>> generically. I can read between that lines that you must feel frustrated
>> and I certainly empathize with that. I think you might want to debug your
>> application and come up with some parameters for us to start helping you.
>> A
>> reproducible example is always best but I understand it might be hard to
>> provide in this particular case.
>>
>
> There is a lot of frustration.  Until today all of it was directed at our
> developers, for creating programs and configs that make way too many
> connections to the DB.
>
> But then today, I had that small eureka moment, thinking "wait a minute
> ... how can this even be happening at all, if the connection pool has
> connections that the DB server says are active and idle?"
>
> Reiterating something I said before: I know you can't help me with the
> pools that the Tomcat servers are creating for our webapps.  So I'll limit
> the rest of the discussion to my own program, which uses DBCP, and has the
> same problems.
>
> Please tell me what information you'd like me to provide. Anything that is
> in my power, I will get it to you.
>
> This is how I set up DBCP in my code:
>
>   /*
>    * Create a datasource (connection pool) for the master database server.
>    */
>   ConnectionFactory cfMaster = new DriverManagerConnectionFactory(masterUrl,
> dbUser, dbPass);
>   PoolableConnectionFactory pcfMaster = new PoolableConnectionFactory(cfMaster,
> null);
>   pcfMaster.setValidationQuery(validationQuery);
>   pcfMaster.setValidationQueryTimeout(Const.FIVE_SECONDS / 1000);
>   opMaster = new GenericObjectPool<>(pcfMaster);
>   opMaster.setMaxWaitMillis(Const.THIRTY_SECONDS);
>   opMaster.setMaxIdle(numShards);
>   opMaster.setMaxTotal(numShards * 5);
>   opMaster.setNumTestsPerEvictionRun(numShards * 5);
> opMaster.setTimeBetweenEvictionRunsMillis(Const.FIVE_SECONDS);
>   opMaster.setMinEvictableIdleTimeMillis(Const.ONE_MINUTE * 5);
>   opMaster.setTestOnCreate(true);
>   opMaster.setTestOnBorrow(true);
>   opMaster.setTestOnReturn(true);
>   opMaster.setTestWhileIdle(true);
>   pcfMaster.setPool(opMaster);
>   dsMaster = new PoolingDataSource<>(opMaster);
>
> The JDBC driver we use is MySQL.  As of a few weeks ago, it was the newest
> stable version available, 5.1.something.  Also at that time, I was using
> the latest DBCP and POOL versions.  If any new versions have come out very
> recently, I probably don't have them yet.
>
> Typically the numShards value we're using is 6, to help with understanding
> the code above.
>
> Observations: When the MySQL server has reached its connection limit, at
> least one of the idle connections is from this program using DBCP.  But
> when the program attempts to use the DB, it gets the "Too many connections"
> error response -- which means that it must be opening a brand new
> connection, despite the fact that there SHOULD be at least one that is
> ready and sitting in the pool.
>
> The code that uses the DB is basic JDBC code.  It calls getConnection() on
> the dataSource, verifies that the connection is valid, creates a statement,
> executes it, and if it was a query, processes the resultset.  Then it
> closes any resultset, closes the statement, and closes the connection.  As
> I understand it, that close should return the connection to the pool, still
> open, and ready for re-use.  This all happens within a single thread.  I
> went through this code pretty closely for another issue on this mailing
> list.  It's possible that I missed something, but it looks very clean.
>
> I was going to add some debug logging to my code, but I can't see any way
> with PoolingDataSource to get the number of active and idle connections,
> just to make SURE that the pool really has what I think it does.
>
> I have a code change ready to switch everything to BasicDataSource and add
> the debug logging.  It's generally less verbose code, and looks to be just
> as configurable as PoolingDataSource.  Would that change be a good idea?
>
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>


--
Matt Sicker <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: [DBCP] Connection pool not behaving as I expect

Shawn Heisey
On 3/1/2018 8:48 PM, Matt Sicker wrote:
> Take a look inside commons-pool for the instrumentation (e.g., JMX). You
> can also track usage on borrow and other leaks.
>
> Also, Tomcat uses DBCP as it is.

I know I can get those numbers from the object pool, but at the point
where I needed them, I don't actually know *which* DataSource is being
used.  It could be one of two.  So I don't know which object pool to
look at.

Interim solution before I switch to BasicDataSource:  Log info from BOTH
pools.  I did this, and here's some info that gets logged on the dev
server where I installed the change:

WARN  - 2018-03-01 21:59:00.313;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.314;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.315;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.316;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.318;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.319;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.320;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.321;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.322;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.323;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.324;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.325;   371; c; CP: main-active=0,
main-idle=1, master-active=0, master-idle=1

This logging happens just before a connection is obtained from the
pool.  The info logged is the same -- zero active, one or more idle. 
Except during program startup, when there are some lines where both are
zero.  And once an hour, there is a background thread that does a query
that takes nearly a minute to run, and while that's happening, I do see
main-active=1.

I did just now think of one possibility that might explain why this
program misbehaves when the problem happens, even if DBCP is working
right.  Based on the logging I've enabled, and the fact that nobody has
said "oh, we see that all the time, and the problem is probably X" ... I
think DBCP probably is working right.

====
Basic info about my program: All reads are done via the "main" pool,
which connects to a slave, unless connections there are not working,
then reads switch to the "master" pool for the next 1000 connections. 
All writes are done via the "master" pool.

When the master server reaches its connection limit, a completely
separate process that adds information to the monster table in the DB
cannot work.  It's not Java-based, and doesn't have connection pooling
available.

When there are no new docs in the monster table, my program doesn't have
anything to do, so it doesn't need to make any changes to its control
table -- it's not going to be using the master pool.  That probably
results in the idle connection to the master server being evicted five
minutes after the additions to the database stop.  Then because there's
now a connection available on the server, the other system can suddenly
add some documents.  So my program notices the new docs via its main
pool, processes them, and then it would need to update its control table
on the master server.  There's no idle connection because it got
evicted, so it tries to make a new one, and we get an explosion.
====

I still do have the separate problem of why our app servers explode with
"Too many connections" exceptions, when the DB pools there have dozens
of active connections.  I didn't write that code.  Maybe their DB access
(which is primarily through hibernate, a library I don't know much
about) is not properly releasing connections back to the pool, so the
pool does not think they're actually idle.  I will need to see what
logging they can add.

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]