#1627 Malformed logs breaking composes
Closed: Fixed 2 years ago by lsedlar. Opened 2 years ago by lsedlar.

Fedora-Rawhide-20220817.n.0 failed in gather phase with this exception:

KeyError: '/mnt/koji/packages/zile/2.4.14/10.fc37/data/signed/eb10b464/src/zile-2.4.14-10.fc37.src.rpmProfiling results:'
Frame link_files in /usr/lib/python3.10/site-packages/pungi/phases/gather/link.py at line 93
<CODE>
...
-->   93         pkg_obj = pkg_by_path[pkg["path"]]
...
</CODE>
<LOCALS>
...
                 pkg = {'path': '/mnt/koji/packages/zile/2.4.14/10.fc37/data/signed/eb10b464/src/zile-2.4.14-10.fc37.src.rpmProfiling results:', 'flags': []}
...
</LOCALS>

Note that the path actually contains bogus extra text.

The path is obtained by parsing logs from depsolver, and those indeed contain such malformed data:

SRPM(): file:///mnt/koji/packages/zidrav/1.2.0/32.fc37/data/signed/eb10b464/src/zidrav-1.2.0-32.fc37.src.rpm
SRPM(): file:///mnt/koji/packages/zile/2.4.14/10.fc37/data/signed/eb10b464/src/zile-2.4.14-10.fc37.src.rpmProfiling results:
   23.85     1 DnfWrapper.fill_sack()
   19.25     1 Gather.gather()

The paths are printed to stdout, logs and profiling information to stderr. Both are merged into the same file. The lines can be mixed in whatever way, that will not be a problem. But joining two lines is a problem.

Here's a small reproducer:

$ cat tester.py 
for run_no in range(1, 1000):
    print(f"Attempt {run_no}")

    proc = subprocess.Popen(
        ["./runner.py"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    ) 
    output, _ = proc.communicate()

    if b"oe" in output or b"eo" in output:
        print("Found mixed lines")
        break

$ cat runner.py 
#!/usr/bin/env python

import sys
import random

for _ in range(1000):
    print("o" * random.randint(10, 20),)
print("e", file=sys.stderr)

The runner prints o to stdout, and e to stderr. By itself it will never mix the two characters on the same line. The tester repeatedly runs it until it the lines get mixed. On my laptop that usually needs less than 100 iterations.


I managed to replicate it with just bash by running python runner.py 2>&1 | egrep 'oe' repeatedly. Eventually it hits it too.

A simple workaround for the particular case encountered in Rawhide is to flush stdout before printing the profiler data: #1628.
A better fix would be to stop redirecting stderr to stdout. That is not possible with kobo.shortcuts.run, so that will need to be replaced with something else.

Metadata Update from @lsedlar:
- Issue set to the milestone: 4.3.7

2 years ago

Log in to comment on this ticket.

Metadata
Related Pull Requests
  • #1628 Merged 2 years ago