Uploaded image for project: 'Grouper'
  1. Grouper
  2. GRP-1600

change log temp needs an index on created date

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • 2.4.0
    • None
    • API
    • None

    Description

      From: grouper-users-request@internet2.edu grouper-users-request@internet2.edu On Behalf Of Hyzer, Chris
      Sent: Tuesday, September 05, 2017 4:17 PM
      To: Gettes, Michael <gettes@ufl.edu>
      Cc: Black, Carey M. <black.123@osu.edu>; grouper-users@internet2.edu
      Subject: RE: [grouper-users] RE: How big should your temp change log be? ( Loader questions )

      When I look at it, I run that query in oracle and the explain plan has a full table scan. When I add the index it does not (cost 0).

      I will discuss with Shilen to see if he agrees to add it.

      FYI, this is the script I ran in oracle:

      CREATE INDEX change_log_temp_created_on_idx ON GROUPER_CHANGE_LOG_ENTRY_TEMP (CREATED_ON);

      Note, patches cant run DDL at this point, but maybe we can add an enhancement for that…

      Thanks
      Chris

      From: Gettes, Michael gettes@ufl.edu
      Sent: Tuesday, September 05, 2017 3:47 PM
      To: Hyzer, Chris <mchyzer@isc.upenn.edu>
      Cc: Black, Carey M. <black.123@osu.edu>; grouper-users@internet2.edu
      Subject: Re: [grouper-users] RE: How big should your temp change log be? ( Loader questions )

      I had this same problem at a previous place of employment. Never got resolved but we were heading down the path of putting the grouper DB on SSD to speed up the read/write/delete operation. The DBAs never indicated there were lack of indices problems but we definitely had the temp->main changelog as choke-point problem. When the “disk” could keep up - it went fast. It really presented as an I/O problem.

      I hope this helps.

      /mrg

      On Sep 5, 2017, at 3:42 PM, Hyzer, Chris <mchyzer@isc.upenn.edu> wrote:

      Reading a record, and writing a record, and deleting a record, should be fast. Has nothing to do with the loader jobs. The theads inserting into the temp table are equal to any thread using the Java Grouper API. So it depends on the number of WS, UI, and loader threads....

      As far as local SQL changes, I don't think that is that common. Local admins just need to run that after upgrading.

      Thanks
      Chris

      ----Original Message----
      From: Black, Carey M. black.123@osu.edu
      Sent: Tuesday, September 05, 2017 3:06 PM
      To: Hyzer, Chris <mchyzer@isc.upenn.edu>
      Cc: grouper-users@internet2.edu
      Subject: RE: How big should your temp change log be? ( Loader questions )

      Chris,

      Thank you for the details.

      FWIW:
      I stopped running all of the loader jobs.
      The grouper_change_log_entry_temp table has cleared itself over a few days.

      So I think it is reasonable to believe that the jobs were submitting more records into the temp table than the loader process was able to convert to the "grouper_change_log_entry" table.
      Assuming that the select from the temp table was a "slow point", then maybe the index will help the loader to keep up with the queue.
      If the index does not resolve it, then I would ask if the "thread counts" match between the loader processes that generate the "_temp" entries vs the ones moving the rows to the "grouper_change_log_entry"?

      For now, I will add the index and start up a few of the jobs and see if I can find the "max rate" point for our system.

      Just a thought:
      Since there may be a local need for "DB customizations", maybe supporting some kind of "local sql script" to run after an upgrade would be helpful?

      If there were a standard way(config file?) to register/track local DB changes then the upgrade could be "pre-wired" for the upgrade to auto apply the local changes too.
      I do recognize that some SQL might not work after the upgrade. However "safe things" (like adding indexes and adding custom Tables) should work. Views based on the grouper tables might fail due to grouper RDBMS structure changes. But that would be a good thing to find in the upgrade process anyways.
      Maybe the SQL file could be conditionally run at startup of the loader too? ( Kind of an "All Sync" type option?)


      Carey Matthew Black.123@osu.edu

      ----Original Message----
      From: Hyzer, Chris mchyzer@isc.upenn.edu
      Sent: Tuesday, September 5, 2017 2:07 PM
      To: Black, Carey M. <black.123@osu.edu>; grouper-users@internet2.edu
      Subject: RE: How big should your temp change log be? ( Loader questions )

      Carey, can you add an index on that column and see if it improves things? You can explain plan that query to see if it works If so, Shilen, do you think we should add that index?

      Carey, have you analyzed all your tables recently? Not sure if rebuilding indexes helps in mysql, but it helped us at penn...

      This "temp table" is auto dropped/created by the system?

      No, it's a permanent temp table

      When is the temp table "cleaned up"? (or should it never has "stuck rows" after the job finishes?)

      The loader will take rows, slap another timestamp on them, and put them in grouper_change_log_entry and remove from grouper_change_log_entry_temp

      Can I just stop the loader then remove all rows from this temp table?

      Change log entries wont be sent out, but yes, you can do that

      There is some reason to NOT add an index to the created_on column?

      I don't think so

      Is it "safe" to do things like adding indexes to grouper's "core tables"?

      Sure. Document it. make sure on major upgrades you reapply the index if it is removed. Also let us know if we should consider adding it to core.

      Are there any other indexes that I should be manually adding? ( or verifying that they were created properly?)

      Not that I know of.

      Why is the default 14 days before the change log is trimmed? ( Why not 0, 1 or 2? )

      They should be removed as loader copies them over. So it should never go up to 14. But if your loader is turned off for a while they should still progress through the change log notifications. If two weeks has gone by, then all bets off.

      Thanks
      Chris

      ----Original Message----
      From: grouper-users-request@internet2.edu grouper-users-request@internet2.edu On Behalf Of Black, Carey M.
      Sent: Wednesday, August 30, 2017 11:08 PM
      To: grouper-users@internet2.edu
      Subject: [grouper-users] How big should your temp change log be? ( Loader questions )

      All,

      I suspect that I have been pushing the loader process a bit harder than I should. (Or I am finding yet more configuration that I need to "tune" to keep things humming along.)

      My DBA reported a "slow query" and I think this is related to loader jobs not "completing".

      – Connection Id: NNNNN
      – User: DB_USER
      – Host: host:port
      – DB: grouper
      – Command: Query
      – Time: 291
      – State: Creating sort index
      select changeloge0_.id as id1_14_, changeloge0_.change_log_type_id as change_l2_14_, changeloge0_.context_id as context_3_14_, changeloge0_.created_on as created_4_14_, changeloge0_.string01 as string5_14_, changeloge0_.string02 as string6_14_, changeloge0_.string03 as string7_14_, changeloge0_.string04 as string8_14_, changeloge0_.string05 as string9_14_, changeloge0_.string06 as string10_14_, changeloge0_.string07 as string11_14_, changeloge0_.string08 as string12_14_, changeloge0_.string09 as string13_14_, changeloge0_.string10 as string14_14_, changeloge0_.string11 as string15_14_, changeloge0_.string12 as string16_14_ from grouper_change_log_entry_temp changeloge0_ order by changeloge0_.created_on limit 1000

      While that is a bit odd to my eye... it basically is a:
      Select a, b, c ... from grouper_change_log_entry_temp changeloge0_ order by changeloge0_.created_on limit 1000

      Some insight into the data in the table....

      select count from grouper_change_log_entry_temp – '6,745,965' That is "only" 6.7 M rows, in a "temp" table. ( Uh...? )
      select min(created_on) from grouper_change_log_entry_temp; – '1503703639031000' --? Friday, August 25, 2017 11:27:19 PM GMT
      select max(created_on) from grouper_change_log_entry_temp; – '1504121955115000' --? Wednesday, August 30, 2017 7:39:15 PM GMT

      – Assuming I am converting those dates properly...
      – That is just under 5 day's worth of data. ( And I think it will not start to trim it till 14 days. ( default value for: loader.retain.db.change_log_entry.days Or does that apply to the "non-temp" loader table? )

      When I look at the grouper_change_log_entry_temp table, I do not see an index on the created_on column.

      Based on reading of mariadb v10.2.8:
      REF: https://urldefense.proofpoint.com/v2/url?u=https-3A__mariadb.com_kb_en_the-2Dmariadb-2Dlibrary_improvements-2Dto-2Dorder-2Dby_&d=DwIFAg&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=EUBHI54mtQDlbcqo5rUTdQ&m=w8JE-eXRCDxqQ4YtKvCq9IOMSrJvy-hCPczf4w1jRHY&s=WvlD0A3kZdwVIoAPwUBnJ4QNo8xm3bBzfNTKWif9ty4&e=
      It seems like there really should be an index on the created_on column to avoid performance issues and speed up the "limit" function of that select statement.

      Can anyone tell me If:
      This "temp table" is auto dropped/created by the system?
      When is the temp table "cleaned up"? (or should it never has "stuck rows" after the job finishes?)
      Can I just stop the loader then remove all rows from this temp table?
      There is some reason to NOT add an index to the created_on column?
      Is it "safe" to do things like adding indexes to grouper's "core tables"?
      Are there any other indexes that I should be manually adding? ( or verifying that they were created properly?)
      Why is the default 14 days before the change log is trimmed? ( Why not 0, 1 or 2? )

      On maybe a more general "DB Config" topic...
      select count from grouper_memberships_all_v; – produces a 30 sec timeout result. No count returned. DB connection dead. Uh.... Is there some overarching DB/DB client configuration issue that is causing me pain?
      Did I miss a section on Grouper's required DB configuration settings somewhere?

      Is there any additional information/ configuration details that could help any of these questions be answered? ( Let me know.)

      Thanks in advance.


      Carey Matthew Black.123@osu.edu

      Attachments

        Activity

          People

            shilen.patel@at.internet2.edu Shilen Patel (duke.edu)
            chris.hyzer@at.internet2.edu Chris Hyzer (upenn.edu)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: