  1. Grouper
  2. GRP-1959

"FullSync timed out after 300 seconds" error

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 2.4.0.patch
    • None
    • provisioning
    • None

    Description

      The parameter fullSyncTimeout_secs, used in scheduleAndAwaitFullSyncOfGroup, is hard-coded to 5 minutes (300 seconds), whereas the parameter coordinationTimeout_secs, used in lockForFullSyncWhenNoIncrementalIsUnderway and lockForIncrementalProvisioningWhenNoFullSyncIsUnderway, is user-configurable.

      We are finding that when very large groups (tens of thousands of users) are added or deleted, PSPNG schedules a FullSync. Even though we can set a high value for coordinationTimeout_secs, the FullSync still times out after 300 seconds because fullSyncTimeout_secs is hard-coded, so the sync is never performed.
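The change being asked for amounts to reading fullSyncTimeout_secs from configuration and falling back to the current 300-second value when it is unset. The following is a minimal sketch of that idea only, not PSPNG's actual code: the class `TimeoutConfigSketch`, the helper `resolveTimeoutSecs`, and the use of `java.util.Properties` as a stand-in for Grouper's configuration layer are all assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical illustration; none of these names come from PSPNG itself.
class TimeoutConfigSketch {
    // The value the issue reports as hard-coded (5 minutes).
    static final int DEFAULT_FULL_SYNC_TIMEOUT_SECS = 300;

    // Returns the configured timeout, or defaultSecs when the key is
    // missing, blank, non-numeric, or not a positive number.
    static int resolveTimeoutSecs(Properties props, String key, int defaultSecs) {
        String raw = props.getProperty(key);
        if (raw == null || raw.trim().isEmpty()) {
            return defaultSecs;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : defaultSecs;
        } catch (NumberFormatException e) {
            return defaultSecs;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed override for a site with very large groups.
        props.setProperty("fullSyncTimeout_secs", "3600");
        int timeout = resolveTimeoutSecs(
                props, "fullSyncTimeout_secs", DEFAULT_FULL_SYNC_TIMEOUT_SECS);
        System.out.println("fullSyncTimeout_secs=" + timeout);
    }
}
```

Falling back to the default on a missing or malformed value keeps existing deployments on the 300-second behaviour while letting sites with large groups raise the limit.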

      See error below:

      300 seconds,clog=clog #1879294 / ChangeLog type: group: updateGroup,group=XXXXXXXXXXXXXXXX]
      2018-12-05 16:01:02,616: [DefaultQuartzScheduler_Worker-10] WARN  PspChangelogConsumerShim.processChangeLogEntries(111) -  - Provisioning summary: Summary: 998 successes/1 failures.  (998 successful entries will be retried because they follow a failure in the queue.) First error was: FullSync timed out after 300 seconds
      2018-12-05 16:01:02,617: [DefaultQuartzScheduler_Worker-10] ERROR ChangeLogHelper.processRecords(286) -  - Did not get all the way through the batch! -1 != 1880292
      2018-12-05 16:01:02,621: [DefaultQuartzScheduler_Worker-10] ERROR GrouperLoaderJob.runJob(485) -  - Error on job: CHANGE_LOG_consumer_licensing
      java.lang.RuntimeException: Error in loader job: null, check logs: Summary: 998 successes/1 failures.  (998 successful entries will be retried because they follow a failure in the queue.) First error was: FullSync timed out after 300 secondsDid not get all the way through the batch! -1 != 1880292
              at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.runJob(GrouperLoaderJob.java:474)
              at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.execute(GrouperLoaderJob.java:345)
              at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
              at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
      2018-12-05 16:01:02,627: [DefaultQuartzScheduler_Worker-10] ERROR GrouperLoaderJob.execute(348) -  - Error running up job
      java.lang.RuntimeException: Error in loader job: null, check logs: Summary: 998 successes/1 failures.  (998 successful entries will be retried because they follow a failure in the queue.) First error was: FullSync timed out after 300 secondsDid not get all the way through the batch! -1 != 1880292
              at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.runJob(GrouperLoaderJob.java:474)
              at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.execute(GrouperLoaderJob.java:345)
              at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
              at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
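For readers unfamiliar with how a bounded wait produces a failure like the one logged above, here is a rough sketch using the standard `java.util.concurrent` API. It is an assumption-laden illustration rather than the real scheduleAndAwaitFullSyncOfGroup: the class `FullSyncAwaitSketch`, the method `awaitFullSync`, and the simulated slow task are hypothetical, and only the error message text is taken from the log.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; not the PSPNG implementation.
class FullSyncAwaitSketch {
    // Waits up to timeoutSecs for the full sync to finish. On timeout the
    // task is cancelled and an exception with the logged message is thrown.
    static void awaitFullSync(Future<?> fullSync, int timeoutSecs) {
        try {
            fullSync.get(timeoutSecs, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            fullSync.cancel(true);
            throw new RuntimeException(
                    "FullSync timed out after " + timeoutSecs + " seconds");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted while awaiting FullSync", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("FullSync failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Simulated full sync that takes longer than the allowed wait.
            Future<?> slowSync = pool.submit(() -> {
                Thread.sleep(3000);
                return null;
            });
            awaitFullSync(slowSync, 1);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Because the wait is bounded by a single number, a sync of a very large group that legitimately needs more time fails in the same way as a hung one, which is why the limit needs to be adjustable per site.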

          People

            bert.beelindgren@at.internet2.edu Bert Bee-Lindgren
            sla23 Stefano Angioni (Inactive)