#11685 koji web interface is extremely slow / times out
Closed: Fixed with Explanation a year ago by kevin. Opened a year ago by marcan.

For the past few weeks, the Koji web interface has been very slow more often than not. This manifests as pages being slow to even begin loading (it's the time to start the response, not the resources/rest of the page). Sometimes I get a 504 Gateway Timeout, which implies this is purely a backend issue (not something with internet routing between the infra and me).

As of right now, load times are in the 30+-second range. A few minutes ago I got some gateway timeouts. It's not always this bad, but it's never as fast as the rest of the infra, and sometimes it's completely unusable with repeated 504s.

FWIW, I'm located in Japan so I'm used to network-related latency, but the Koji issue is on another level. E.g. right now, Pagure pages take 2-5 seconds to load, which is about what I'd expect.


Yep. We have been having problems with the database. I've spent a bunch of time trying to tune it, with no luck so far. ;(

Likely we need to partition a gigantic table, but that's going to take an outage... ;(

So, let me use this ticket to dump status (I was meaning to file one, but hadn't yet).

The high load seems at least somewhat cyclical. It usually hits from about :50 past the hour until about :10 after the top of the next hour. You can see this in the load graphs:

https://admin.fedoraproject.org/collectd/bin/index.cgi?hostname=db-koji01.iad2.fedoraproject.org&plugin=load&timespan=86400&action=show_selection&ok_button=OK

I've done a bunch of tweaking, but there are so many ways to tweak things that it's difficult to see what, if anything, is having an effect. This includes tuned profiles, I/O schedulers, and postgresql settings: max_workers, parallel workers, a bunch of buffer and I/O settings, etc. I keep hoping I can get it to a stable state for now so I can relax on my vacation, but so far that hasn't happened. ;(
I did another round of tweaking this morning, and it seems a bit better, but only time will tell.
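For context, the postgresql-side tweaks mentioned above are of this general shape; the following is only a sketch, and the parameter values are illustrative rather than the ones actually applied to db-koji01:

-- Illustrative only: the kinds of settings being tweaked, with made-up values,
-- not the actual configuration of db-koji01.
ALTER SYSTEM SET max_parallel_workers = 16;
ALTER SYSTEM SET max_parallel_workers_per_gather = 6;
ALTER SYSTEM SET effective_io_concurrency = 200;
ALTER SYSTEM SET shared_buffers = '32GB';   -- takes effect only after a server restart
SELECT pg_reload_conf();                    -- applies the reloadable settings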

Some observations on the koji queries:

  1. The buildroot_listing and task tables are massive:
           Table           |    Size    | External Size 
---------------------------+------------+---------------
 buildroot_listing         | 789 GB     | 326 GB
 task                      | 164 GB     | 17 GB

The koji database howto describes how to partition the buildroot_listing table, but I am not sure if anyone has done this or whether it's still up to date. I would want to run the process in staging first to work out any issues, and of course it would require an outage in production, possibly a longish one, depending on how long the copying takes. Running it in stg should also give us a rough idea of how long it would take in prod. (A generic partitioning sketch is included after this list.)

  2. Consistently, the tasks I see running longer than a minute are rollbacks (a query for spotting these is sketched after this list). I think this is because koji is trying to close connections, but I am unsure why they take so long to finish. There shouldn't be anything in them. ;(
   pid   |    duration     |                       query                        
---------+-----------------+----------------------------------------------------
 3857408 | 00:01:07.314667 | ROLLBACK
 3857746 | 00:01:07.417646 | ROLLBACK
 3857948 | 00:01:24.198496 | ROLLBACK
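
The pid/duration/query listing above looks like output from a query over pg_stat_activity. As a sketch (not necessarily the exact query used here), something along these lines lists statements that have been active for more than a minute:

-- Sketch: currently active statements that have been running for over a minute.
SELECT pid,
       now() - query_start AS duration,
       query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 minute'
ORDER BY duration DESC;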
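
And for the partitioning mentioned in the first point: the sketch below shows what PostgreSQL declarative range partitioning generally looks like. It is not the procedure from the koji database howto, and the table and column names are made up purely for illustration; the expensive part is copying the existing rows into the partitioned table, which is why an outage would be needed.

-- Hypothetical example only: a generic range-partitioned table, not the actual
-- koji schema or the howto's migration steps.
CREATE TABLE big_listing (
    parent_id INTEGER NOT NULL,
    item_id   INTEGER NOT NULL
) PARTITION BY RANGE (parent_id);

-- One partition per id range; more get added as the table grows.
CREATE TABLE big_listing_p0 PARTITION OF big_listing
    FOR VALUES FROM (0) TO (10000000);
CREATE TABLE big_listing_p1 PARTITION OF big_listing
    FOR VALUES FROM (10000000) TO (20000000);

-- Migrating existing data would be something along the lines of:
--   INSERT INTO big_listing SELECT parent_id, item_id FROM old_big_listing;
-- followed by creating the needed indexes on the partitions.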

CC: @mikem @tkopecek for any advice. ;)

Metadata Update from @kevin:
- Issue assigned to kevin
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: high-gain, high-trouble, ops

a year ago

I am not sure I got the full query but:

explain SELECT buildroot.container_arch, buildroot.br_type, buildroot.cg_id, content_generator.name,
    buildroot.cg_version, buildroot.container_arch, buildroot.container_type,
    create_events.id, create_events.time, date_part('epoch', create_events.time),
    buildroot.extra, buildroot.host_arch, host.id, host.name, buildroot.host_os, buildroot.id,
    repo_create.id, repo_create.time, repo.id, repo.state,
    retire_events.id, retire_events.time, date_part('epoch', retire_events.time),
    standard_buildroot.state, tag.id, tag.name, standard_buildroot.task_id
from buildroot
LEFT OUTER JOIN standard_buildroot ON standard_buildroot.buildroot_id = buildroot.id
LEFT OUTER JOIN content_generator ON buildroot.cg_id = content_generator.id
LEFT OUTER JOIN host ON host.id = standard_buildroot.host_id
LEFT OUTER JOIN repo ON repo.id = standard_buildroot.repo_id
LEFT OUTER JOIN tag ON tag.id = repo.tag_id
LEFT OUTER JOIN events AS create_events ON create_events.id = standard_buildroot.create_event
LEFT OUTER JOIN events AS retire_events ON standard_buildroot.retire_event = retire_events.id
LEFT OUTER JOIN events AS repo_create ON repo_create.id = repo.create_event
WHERE (host.id = 561) AND (standard_buildroot.state IN (0, 1, 2));
                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=1007.40..497698.48 rows=1 width=599)
   ->  Nested Loop Left Join  (cost=1006.84..497697.18 rows=1 width=569)
         ->  Nested Loop Left Join  (cost=1006.27..497688.60 rows=1 width=561)
               ->  Nested Loop Left Join  (cost=1005.70..497680.01 rows=1 width=553)
                     ->  Nested Loop Left Join  (cost=1005.41..497679.70 rows=1 width=515)
                           ->  Nested Loop  (cost=1004.98..497671.25 rows=1 width=503)
                                 ->  Nested Loop Left Join  (cost=1000.70..497662.95 rows=1 width=466)
                                       ->  Nested Loop  (cost=1000.57..497662.79 rows=1 width=452)
                                             ->  Gather  (cost=1000.00..497654.21 rows=1 width=28)
                                                   Workers Planned: 6
                                                   ->  Parallel Seq Scan on standard_buildroot  (cost=0.00..496654.11 rows=1 width=28)
                                                         Filter: ((host_id = 561) AND (state = ANY ('{0,1,2}'::integer[])))
                                             ->  Index Scan using buildroot_host_id_idx on buildroot (cost=0.56..8.58 rows=1 width=428)
                                                   Index Cond: (id = standard_buildroot.buildroot_id)
                                       ->  Index Scan using content_generator_pkey on content_generator  (cost=0.13..0.15 rows=1 width=18)
                                             Index Cond: (id = buildroot.cg_id)
                                 ->  Bitmap Heap Scan on host  (cost=4.28..8.30 rows=1 width=41)
                                       Recheck Cond: (id = 561)
                                       ->  Bitmap Index Scan on host_pkey  (cost=0.00..4.28 rows=1 width=0)
                                             Index Cond: (id = 561)
                           ->  Index Scan using repo_pkey on repo  (cost=0.43..8.45 rows=1 width=16)
                                 Index Cond: (id = standard_buildroot.repo_id)
                     ->  Index Scan using tag_pkey on tag  (cost=0.29..0.31 rows=1 width=42)
                           Index Cond: (id = repo.tag_id)
               ->  Index Scan using events_id_idx_ccnew on events create_events  (cost=0.57..8.59 rows=1 width=12)
                     Index Cond: (id = standard_buildroot.create_event)
         ->  Index Scan using events_id_idx_ccnew on events retire_events  (cost=0.57..8.59 rows=1 width=12)
               Index Cond: (id = standard_buildroot.retire_event)
   ->  Index Scan using events_id_idx_ccnew on events repo_create  (cost=0.57..1.29 rows=1 width=12)
         Index Cond: (id = repo.create_event)
(30 rows)

Judging by the CPU graphs, it looks like it's purely a compute issue: everything is maxed out during the spikes, and iowait doesn't stand out as pointing at I/O. So yeah, probably expensive queries, missing indices, or similar database-side things? I imagine tweaking I/O or worker/parallelism configs won't make a difference here, and we should focus on the actual queries and schema.

Parallel Seq Scan on standard_buildroot

That inner scan looks suspicious; how big is that table? AIUI this will scan the whole table (presumably because it doesn't have an index over the queried columns).

           Table           |    Size    | External Size 
---------------------------+------------+---------------
 standard_buildroot        | 3755 MB    | 1018 MB
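
(As an aside, size listings like the ones in this ticket can be produced with a query roughly like the sketch below, where "External Size" is assumed to be everything beyond the main heap, i.e. indexes and TOAST; this is a guess at how the numbers above were obtained, not necessarily the exact query used.)

-- Sketch: largest tables, their total size, and the portion outside the main heap.
SELECT c.relname AS "Table",
       pg_size_pretty(pg_total_relation_size(c.oid)) AS "Size",
       pg_size_pretty(pg_total_relation_size(c.oid) - pg_relation_size(c.oid)) AS "External Size"
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;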

Because it was easy, I tossed more CPUs at it... 64 -> 88

Yep, it looks like you could benefit from some additional indexes. I'm not sure if we should add them to the default koji code, but at least some of them, yes.
I have the following:

"standard_buildroot_pkey" PRIMARY KEY, btree (buildroot_id)
"standard_buildroot_create_event" btree (create_event)
"standard_buildroot_host_id" btree (host_id)
"standard_buildroot_repo_id" btree (repo_id)
"standard_buildroot_retire_event" btree (retire_event)
"standard_buildroot_task_id" btree (task_id)

Depending on usage patterns, most of them may not be used at all, so it is worth looking at the statistics later and dropping the ones that aren't used. Here it looks like at least the host_id one would be beneficial.
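
A minimal sketch of how those statistics could be checked later, using the standard pg_stat_user_indexes view; indexes whose idx_scan stays at zero over a representative period are candidates for dropping:

-- Sketch: per-index usage on standard_buildroot; idx_scan = 0 over a long
-- enough window suggests the index is unused.
SELECT relname      AS table_name,
       indexrelname AS index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'standard_buildroot'
ORDER BY idx_scan;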

koji=# create index concurrently standard_buildroot_host_id on standard_buildroot(host_id);
CREATE INDEX
 Nested Loop Left Join  (cost=7.71..79939.08 rows=1 width=599)
   ->  Nested Loop Left Join  (cost=7.15..79937.78 rows=1 width=569)
         ->  Nested Loop Left Join  (cost=6.58..79929.19 rows=1 width=561)
               ->  Nested Loop Left Join  (cost=6.01..79920.61 rows=1 width=553)
                     ->  Nested Loop Left Join  (cost=5.72..79920.30 rows=1 width=515)
                           ->  Nested Loop  (cost=5.29..79911.85 rows=1 width=503)
                                 ->  Nested Loop Left Join  (cost=1.00..79903.55 rows=1 width=466)
                                       Join Filter: (buildroot.cg_id = content_generator.id)
                                       ->  Nested Loop  (cost=1.00..79902.48 rows=1 width=452)
                                              ->  Index Scan using standard_buildroot_host_id on standard_buildroot  (cost=0.44..79893.90 rows=1 width=28)
                                                   Index Cond: (host_id = 561)
                                                   Filter: (state = ANY ('{0,1,2}'::integer[]))
                                              ->  Index Scan using buildroot_host_id_idx on buildroot  (cost=0.56..8.58 rows=1 width=428)
                                                   Index Cond: (id = standard_buildroot.buildroot_id)
                                       ->  Seq Scan on content_generator  (cost=0.00..1.03 rows=3 width=18)
                                 ->  Bitmap Heap Scan on host  (cost=4.28..8.30 rows=1 width=41)
                                       Recheck Cond: (id = 561)
                                       ->  Bitmap Index Scan on host_pkey  (cost=0.00..4.28 rows=1 width=0)
                                             Index Cond: (id = 561)
                           ->  Index Scan using repo_pkey on repo  (cost=0.43..8.45 rows=1 width=16)
                                 Index Cond: (id = standard_buildroot.repo_id)
                     ->  Index Scan using tag_pkey on tag  (cost=0.29..0.31 rows=1 width=42)
                           Index Cond: (id = repo.tag_id)
                ->  Index Scan using events_id_idx_ccnew on events create_events  (cost=0.57..8.59 rows=1 width=12)
                     Index Cond: (id = standard_buildroot.create_event)
         ->  Index Scan using events_id_idx_ccnew on events retire_events  (cost=0.57..8.59 rows=1 width=12)
               Index Cond: (id = standard_buildroot.retire_event)
   ->  Index Scan using events_id_idx_ccnew on events repo_create  (cost=0.57..1.29 rows=1 width=12)
         Index Cond: (id = repo.create_event)
(29 rows)

Yep. Vast improvement from that one index.

With that index in place... everything is back to fine and responsive.
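
(The plans above only show planner cost estimates, roughly 497698 vs. 79939. To confirm the real-world improvement, EXPLAIN (ANALYZE, BUFFERS) reports actual execution times and buffer usage; the sketch below uses a trimmed-down version of the query with the same host id, and it actually runs the query, so it should be used with care on a loaded database.)

-- Sketch: measure actual execution time and buffer usage for a trimmed-down
-- version of the query above (this executes the query for real).
EXPLAIN (ANALYZE, BUFFERS)
SELECT standard_buildroot.task_id
FROM standard_buildroot
WHERE standard_buildroot.host_id = 561
  AND standard_buildroot.state IN (0, 1, 2);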

I am going to close this now; we can track adding those indexes in an upstream issue. :)

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

a year ago
