Issue #371: Timeout when verbose=True and lots of results - greenwave

Closed: Fixed 5 years ago by gnaponie. Opened 5 years ago by gnaponie.

TLDR: Greenwave hits a timeout when the verbose flag is true and there are too many results, because it tries to retrieve all of them to include in the decision response.

Long version:
When Greenwave makes a decision and the verbose flag is true, the response includes all the results that match the subject_identifier and subject_type, regardless of the testcase. That means it returns not only the results involved in the decision, but every matching result for the subject.
When Greenwave asks for these results, it makes one request to ResultsDB for each result. This was done for performance reasons (in the common case there are only a few results, so it was convenient to ask for just that one).
The issue is that in some cases there are thousands of matching results. Greenwave then starts making thousands of requests to ResultsDB, and the decision request times out before the retrieval finishes.
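
To make the cost of that request pattern concrete, here is a minimal sketch that counts requests against an in-memory stand-in for ResultsDB. `FakeResultsDB`, `query`, and `fetch_all` are hypothetical names made up for this illustration; they are not Greenwave or ResultsDB code, and the paging behaviour is an assumption.

```python
from typing import Dict, List

Result = Dict[str, object]


class FakeResultsDB:
    """Hypothetical in-memory stand-in for ResultsDB that counts requests."""

    def __init__(self, results: List[Result]) -> None:
        self.results = results
        self.requests = 0

    def query(self, subject_type: str, subject_identifier: str,
              limit: int, page: int = 0) -> List[Result]:
        # Every call models one HTTP request plus one database query.
        self.requests += 1
        matching = [r for r in self.results
                    if r["type"] == subject_type
                    and r["item"] == subject_identifier]
        start = page * limit
        return matching[start:start + limit]


def fetch_all(db: FakeResultsDB, subject_type: str,
              subject_identifier: str, limit: int) -> List[Result]:
    """Page through every matching result, `limit` results per request."""
    collected: List[Result] = []
    page = 0
    while True:
        batch = db.query(subject_type, subject_identifier, limit, page)
        collected.extend(batch)
        if len(batch) < limit:
            # A short (or empty) page means there is nothing left to fetch.
            return collected
        page += 1
```

With 3000 matching results, this sketch issues 3001 requests at `limit=1` and 151 requests at `limit=20`, which is the difference between the two request patterns in this simplified model.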

I see several problems with that:
- the obvious one: the timeout. The user cannot gate, because Greenwave spends its time retrieving information that is completely useless 90% of the time.
- overhead in Greenwave: if you need the results, why not ask ResultsDB directly?
- overhead in ResultsDB: it receives thousands of requests for this useless information, and for each of them it needs to query the database.

POSSIBLE SOLUTIONS:
1. Truncate the results and return a smaller number; even a "big" number would be better (200? 100?). Downside: some other services expect Greenwave to return all of the results.
2. Do nothing: if you hit the timeout, disable the verbose flag.
3. Create a $new_flag that returns only the results involved in the decision, and leave the verbose flag as it is, so that whoever needs it can still use it (for example Bodhi).
4. Paginate the Greenwave response with all the results. I don't like this one: it puts a lot of overhead on Greenwave and ResultsDB.
5. Change the limit when Greenwave asks ResultsDB for the results. Even limit=20 (the default) would be fine by me. Some investigation was done in the past and that caused performance problems, but I would still proceed with this one. Having all responses a bit slower is better than having some fast and some timing out. A slow response is still a response; a timeout blocks people from gating.
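
Options 1 and 3 both come down to filtering the list before it goes into the response. A rough sketch, assuming each result is a dict with a `testcase` key; `select_results`, `required_testcases`, and `max_results` are hypothetical names for illustration, not existing Greenwave parameters:

```python
from typing import Dict, Iterable, List, Optional

Result = Dict[str, object]


def select_results(results: Iterable[Result],
                   required_testcases: Optional[Iterable[str]] = None,
                   max_results: Optional[int] = None) -> List[Result]:
    """Reduce the results that would be returned in a verbose response.

    required_testcases: keep only results for these testcases
        (option 3, results involved in the decision).
    max_results: keep at most this many results (option 1, truncation).
    """
    selected = list(results)
    if required_testcases is not None:
        required = set(required_testcases)
        selected = [r for r in selected if r["testcase"] in required]
    if max_results is not None:
        selected = selected[:max_results]
    return selected
```

Truncation alone keeps whatever happens to come first in the list, which is why option 1 can drop results that a consumer such as Bodhi expects to see.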

cc. @bowlofeggs @lholecek @mprahl @lucarval @mvadkert @ralph

mprahl commented 5 years ago

> When Greenwave needs to take a decision, in the response, if the verbose flag is true, it puts all the results that match the subject_identifier and subject_type, not considering the testcase. That means that it doesn't return only the results that are involved in the decision making, but all the matching ones for the subject.

Having just started to look at the Greenwave code, I'm not entirely sure what the right answer is, but would it be possible to just return the results that were used to make a gating decision?

Another idea would be to improve the filtering in ResultsDB's API so that you could get all the results in one request. Greenwave wouldn't necessarily have to return those results either. It could just return the ResultsDB URL that the caller needs to get the results. This could lead to CORS issues, but we can likely work around that.
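
As a rough illustration of the "return a URL" idea, the sketch below builds a query URL from the subject. The base URL and the `type` and `item` query parameter names are assumptions made for this example and should be checked against the ResultsDB API; `results_url` is a hypothetical helper, not existing Greenwave code.

```python
from urllib.parse import urlencode


def results_url(base_url: str, subject_type: str,
                subject_identifier: str) -> str:
    """Build a URL the caller could use to fetch the results itself.

    The "type" and "item" parameter names are assumptions for this sketch.
    """
    query = urlencode({"type": subject_type, "item": subject_identifier})
    return f"{base_url}?{query}"
```

Greenwave would then put a string like this into the decision response instead of the full list, and the caller would make the follow-up request.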

gnaponie commented 5 years ago

> Having just started to look at the Greenwave code, I'm not entirely sure what the right answer is, but would it be possible to just return the results that were used to make a gating decision?

Here is some background: https://pagure.io/greenwave/issue/287, https://pagure.io/greenwave/pull-request/289
The main issue is that Bodhi relies on that data, so we cannot remove it. That would break backwards compatibility.

> Another idea would be to improve the filtering in ResultsDB's API so that you could get all the results in one request. Greenwave wouldn't necessarily have to return those results either. It could just return the URL to ResultsDB the caller needs to get the results. This could lead to CORS issues, but we can likely work around that.

That might be an idea, but the services that rely on that data would need to change their code to make an additional request to ResultsDB with the URL provided by Greenwave (one of these services is certainly Bodhi).

Edited 5 years ago by gnaponie

lucarval commented 5 years ago

Does aggregating results make sense here?

From what I understand, Greenwave will use only the latest result for a given test. For example, there may be 100 entries for test X for a given artifact. In this case, 99 of these entries can be ignored. I don't see a lot of value in returning all 100 results. If clients do care about these results, then I think it makes sense for them to get them from ResultsDB.

This might drastically reduce the number of results that Greenwave needs to fetch from ResultsDB. It would also do away with having to request 1 result at a time until we get what we want.

This may require API changes to ResultsDB to: 1) aggregate all results for a given artifact; 2) pick the latest of each result "group"; 3) return the matched results.

Another example, if resultsdb contains the following results for a given artifact:

ID -> Test
1 -> X
2 -> X
...
9 -> X
10 -> Y
11 -> Y
...
99 -> Y

I'd expect a query to resultsdb to return the test results 9 and 99.
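
The aggregation in this example can be sketched on the client side as follows. It assumes each result is a dict with `id` and `testcase` keys and that a higher ID means a more recent result; both are assumptions for illustration, and `latest_per_testcase` is a hypothetical helper rather than a ResultsDB API.

```python
from typing import Dict, Iterable, List

Result = Dict[str, object]


def latest_per_testcase(results: Iterable[Result]) -> List[Result]:
    """Keep only the result with the highest ID for each testcase."""
    latest: Dict[object, Result] = {}
    for result in results:
        current = latest.get(result["testcase"])
        # Assumption: a higher ID means the result was submitted later.
        if current is None or result["id"] > current["id"]:
            latest[result["testcase"]] = result
    return sorted(latest.values(), key=lambda r: r["id"])


# The data from the example above: IDs 1-9 for test X, 10-99 for test Y.
results = ([{"id": i, "testcase": "X"} for i in range(1, 10)]
           + [{"id": i, "testcase": "Y"} for i in range(10, 100)])

print([r["id"] for r in latest_per_testcase(results)])  # prints [9, 99]
```

Doing this inside ResultsDB, as proposed, would avoid transferring the 97 ignored results at all; the sketch only shows the selection rule.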

gnaponie commented 5 years ago

After some discussion with Luiz and Randy, we realized that Bodhi doesn't need to show all the results returned by Greenwave, only the latest result for each testcase.
This already shortens the list of results considerably.
And it won't break anything. Let's do this.

gnaponie commented 5 years ago

PR: https://pagure.io/greenwave/pull-request/378

ralph commented 5 years ago

Did #378 have the intended effect?

gnaponie commented 5 years ago

We are holding off on releasing #378 because it doesn't support scenarios yet; we are still waiting on the change in ResultsDB.
Once that change in ResultsDB is done, this will still need a small change, and then we'll release it and unblock the Bodhi issue...

gnaponie commented 5 years ago

The PR got merged. Closing this one.

Metadata Update from @gnaponie:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

5 years ago
