[jira] [Created] (DAEMON-288) Hang while stopping procrun service

ASF GitHub Bot (Jira)
Mike Miller created DAEMON-288:
----------------------------------

             Summary: Hang while stopping procrun service
                 Key: DAEMON-288
                 URL: https://issues.apache.org/jira/browse/DAEMON-288
             Project: Commons Daemon
          Issue Type: Bug
          Components: Procrun
    Affects Versions: 1.0.13
         Environment: Windows 7 64 bit
            Reporter: Mike Miller


The procrun service sometimes hangs while it is attempting to stop.  The hang is not easy to reproduce (about 5% to 30% of the time, depending on the PC).  Analyzing the hang with a debugger, both the serviceMain() and serviceStop() threads appear to have run and exited; I can tell this from the state of global variables such as gSargs and gShutdownEvent.  Looking at the code, both threads call reportServiceStatus(SERVICE_STOPPED...).  Typically, when either one reports SERVICE_STOPPED, the main thread unblocks and the process terminates, often before both threads have run to completion.  I think this is a race condition caused by the way reportServiceStatus() is used.  The MSDN documentation for SetServiceStatus() says to call it with SERVICE_STOPPED only after all cleanup has occurred, and to call it only once.  In procrun, two threads both attempt to report SERVICE_STOPPED, and each is likely to do so while the other thread is still running.  I believe this is the root cause of the Service Control Manager sometimes being unable to stop the service.
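
To make the "report only once" requirement concrete, here is a minimal sketch of a guard that lets exactly one caller report the stopped state.  It is not taken from the procrun sources and it is not the change described in this report: report_stopped_once(), report_status_stub() and STATE_STOPPED are hypothetical stand-ins for reportServiceStatus() and SERVICE_STOPPED, and the code uses portable C11 atomics rather than the Win32 API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for SERVICE_STOPPED; not the Win32 constant. */
#define STATE_STOPPED 1

/* Set by the first caller that reports the stopped state. */
static atomic_flag stopped_reported = ATOMIC_FLAG_INIT;

/* Counts how many times the stub below saw STATE_STOPPED. */
static atomic_int stopped_reports = 0;

/* Stub standing in for the real status call; it only counts invocations. */
static void report_status_stub(int state)
{
    if (state == STATE_STOPPED)
        atomic_fetch_add(&stopped_reports, 1);
}

/*
 * Reports the stopped state at most once, no matter how many threads call it.
 * Returns true for the single caller that actually reported, false otherwise.
 */
static bool report_stopped_once(void)
{
    /* test_and_set returns the previous value: true means someone was first. */
    if (atomic_flag_test_and_set(&stopped_reported))
        return false;
    report_status_stub(STATE_STOPPED);
    return true;
}
```

A guard like this only addresses the "once" half of the requirement.  It does not ensure that all cleanup has finished before the state is reported, so the ordering of the two threads still has to be handled separately.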
 
As a potential solution, I modified serviceStop() so that it no longer calls reportServiceStatus(SERVICE_STOPPED...) and moved the SetEvent(gShutdownEvent) call to the end of the method.  This change allows the thread running serviceStop() to complete.  SERVICE_STOPPED is now reported in only one place, when serviceMain() exits.  With this refactoring to report SERVICE_STOPPED only once (per MSDN), I have not been able to reproduce the hang.
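
The ordering described above can be modelled with a small single-threaded sketch.  This is an illustration under stated assumptions, not the actual patch: service_stop_sketch(), service_main_sketch(), shutdown_signaled and the event log are hypothetical names, the Win32 event is replaced by a plain flag, and the real serviceMain() would block in a wait call instead of returning early.

```c
/* All names here are hypothetical stand-ins, not the procrun sources. */
enum { MAX_LOG = 8 };

static const char *event_log[MAX_LOG]; /* ordered record of what happened */
static int event_count = 0;
static int shutdown_signaled = 0;      /* stands in for gShutdownEvent */
static int stopped_reports = 0;        /* number of STOPPED reports made */

static void log_event(const char *what)
{
    if (event_count < MAX_LOG)
        event_log[event_count++] = what;
}

/*
 * Stop handler: finishes its own work first and signals the shutdown
 * event as its very last step.  It does not report STOPPED itself.
 */
static void service_stop_sketch(void)
{
    log_event("stop: shut down worker");
    shutdown_signaled = 1; /* models SetEvent() moved to the end */
    log_event("stop: signaled shutdown event");
}

/*
 * Main service routine: proceeds only once the event has been signaled.
 * Returns 0 if the event is not set yet (real code would block instead),
 * and 1 after it has cleaned up and reported STOPPED exactly once.
 */
static int service_main_sketch(void)
{
    if (!shutdown_signaled)
        return 0;
    log_event("main: cleanup");
    stopped_reports++;
    log_event("main: reported STOPPED");
    return 1;
}
```

In this model the STOPPED report is always the last recorded event, and it can only happen after the stop handler has finished, which is the property the change is meant to establish.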

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira