fedora-review v0.7.6 fails with an exception if the spec contains a Source URL with a space in it.
02-18 15:32 root INFO Downloading (Source0): https://sourceforge.net/projects/xtrkcad-fork/files/XTrackCad/Version 5.2.2/xtrkcad-source-5.2.2GA.tar.gz
02-18 15:32 root DEBUG Exception down the road...
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/FedoraReview/review_helper.py", line 236, in run
    self._do_run(outfile)
  File "/usr/lib/python3.9/site-packages/FedoraReview/review_helper.py", line 226, in _do_run
    self._do_report(outfile)
  File "/usr/lib/python3.9/site-packages/FedoraReview/review_helper.py", line 99, in _do_report
    self._run_checks(self.bug.spec_file, self.bug.srpm_file, outfile)
  File "/usr/lib/python3.9/site-packages/FedoraReview/review_helper.py", line 108, in _run_checks
    self.checks = Checks(spec, srpm)
  File "/usr/lib/python3.9/site-packages/FedoraReview/checks.py", line 286, in __init__
    self.data.sources = SourcesDataSource(self.spec)
  File "/usr/lib/python3.9/site-packages/FedoraReview/datasrc.py", line 248, in __init__
    self.sources_by_tag[tag] = Source(tag, url)
  File "/usr/lib/python3.9/site-packages/FedoraReview/source.py", line 76, in __init__
    self.filename = self._get_file(url, ReviewDirs.upstream, my_logger)
  File "/usr/lib/python3.9/site-packages/FedoraReview/helpers_mixin.py", line 109, in _get_file
    self.urlretrieve(link, path)
  File "/usr/lib/python3.9/site-packages/FedoraReview/helpers_mixin.py", line 86, in urlretrieve
    istream = urllib.request.urlopen(url)
  File "/usr/lib64/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python3.9/urllib/request.py", line 517, in open
    response = self._open(req, data)
  File "/usr/lib64/python3.9/urllib/request.py", line 534, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib64/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.9/urllib/request.py", line 1389, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib64/python3.9/urllib/request.py", line 1346, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib64/python3.9/http/client.py", line 1285, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib64/python3.9/http/client.py", line 1296, in _send_request
    self.putrequest(method, url, **skips)
  File "/usr/lib64/python3.9/http/client.py", line 1130, in putrequest
    self._validate_path(url)
  File "/usr/lib64/python3.9/http/client.py", line 1230, in _validate_path
    raise InvalidURL(f"URL can't contain control characters. {url!r} "
http.client.InvalidURL: URL can't contain control characters. '/projects/xtrkcad-fork/files/XTrackCad/Version 5.2.2/xtrkcad-source-5.2.2GA.tar.gz' (found at least ' ')
Reverting commit c15b780 fixed this issue. According to the commit message, that change was made to support some other kind of URL, so simply reverting it is not an option.
It seems that the url variable that is fetched is built on line 171 of spec_file.py. That line calls unquote on the URL, so the resulting string contains a literal space and is no longer a valid URL. urllib.request.urlopen does not accept such invalid URLs, as the stack trace shows.
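The effect of unquote can be reproduced without any network access. The following is only a minimal sketch, not code from fedora-review: has_raw_whitespace is a hypothetical helper introduced here for illustration, and the URL is the percent-encoded form of the one in the log above.

```python
from urllib.parse import unquote, urlsplit

# URL as it would appear in a spec file, with the space percent-encoded.
encoded = ("https://sourceforge.net/projects/xtrkcad-fork/files/XTrackCad/"
           "Version%205.2.2/xtrkcad-source-5.2.2GA.tar.gz")

# unquote() turns %20 back into a literal space.
decoded = unquote(encoded)


def has_raw_whitespace(url):
    """Hypothetical helper: True if the URL path holds a literal whitespace character."""
    return any(ch.isspace() for ch in urlsplit(url).path)


print(has_raw_whitespace(encoded))  # False
print(has_raw_whitespace(decoded))  # True
```

The decoded string is the kind of value that http.client rejects with InvalidURL in the traceback.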
Unquoting was added earlier in e854054. Amusingly, the reason was to support urls that contain %20 (i.e. percent-encoded whitespace).
Investigation continues.
Pull request #438 resolves this.
Just to make sure: the source URL in the spec file still needs to be percent-encoded correctly (note the %20):
Source0: https://sourceforge.net/projects/xtrkcad-fork/files/XTrackCad/Version%205.2.2/xtrkcad-source-5.2.2GA.tar.gz
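For packagers who start from a URL with a literal space, the encoded form can be produced with the standard library. This is a rough sketch under assumptions: requote_path is a hypothetical helper, it is not part of fedora-review, and it does not describe what pull request #438 does.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def requote_path(url):
    """Hypothetical helper: percent-encode unsafe characters in the URL path.

    "%" is listed as safe so that escapes already present (such as %20)
    are left alone instead of being encoded a second time.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path,
                       parts.query, parts.fragment))


raw = ("https://sourceforge.net/projects/xtrkcad-fork/files/XTrackCad/"
       "Version 5.2.2/xtrkcad-source-5.2.2GA.tar.gz")

# Prints the URL with "Version%205.2.2" in the path.
print(requote_path(raw))
```

Treating "%" as safe makes the helper idempotent for this input, at the cost of not escaping a stray literal percent sign in a path.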