We do most of the parsing with utility functions in ./countme/regex.py. We need to have these covered by tests.
The regexes in question are there to help parse log lines such as:
195.181.167.200 - - [12/Jul/2021:00:00:10 +0000] "GET /metalink?repo=fedora-modular-33&arch=x86_64&countme=3 HTTP/2.0" 200 26520 "-" "libdnf (Fedora 33; cloud; Linux.x86_64)"
One such regex is, for example:
    LOG_PATTERN_FORMAT = (
        r"^"
        r"(?P<host>{host})\s"
        r"(?P<identity>{identity})\s"
        r"(?P<user>{user})\s"
        r"\[(?P<time>{time})\]\s"
        r'"(?P<method>{method})\s'
        r"(?P<path>{path})(?:\?(?P<query>{query})){query_match}"
        r'\s(?P<protocol>{protocol})"\s'
        r"(?P<status>{status})\s"
        r"(?P<nbytes>{nbytes})\s"
        r'"(?P<referrer>{referrer})"\s'
        r'"(?P<user_agent>{user_agent})"\s*'
        r"$"
    )
We should verify we are parsing the lines correctly with these regexes.
We are using pytest (https://docs.pytest.org/en/6.2.x/) for writing tests and the standard Python regex library, re (https://docs.python.org/3/library/re.html).
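A minimal sketch of what such a test could look like, using the sample log line above. The sub-patterns in ASSUMED_FIELDS and the names LOG_RE and SAMPLE_LINE are hypothetical stand-ins made up for illustration; the real values live in ./countme/regex.py and may well be stricter. The test functions use plain asserts, so pytest will collect them as they are, and the cases could later be moved into pytest.mark.parametrize.

```python
import re

# Template copied from the issue description.
LOG_PATTERN_FORMAT = (
    r"^"
    r"(?P<host>{host})\s"
    r"(?P<identity>{identity})\s"
    r"(?P<user>{user})\s"
    r"\[(?P<time>{time})\]\s"
    r'"(?P<method>{method})\s'
    r"(?P<path>{path})(?:\?(?P<query>{query})){query_match}"
    r'\s(?P<protocol>{protocol})"\s'
    r"(?P<status>{status})\s"
    r"(?P<nbytes>{nbytes})\s"
    r'"(?P<referrer>{referrer})"\s'
    r'"(?P<user_agent>{user_agent})"\s*'
    r"$"
)

# ASSUMPTION: deliberately loose, hypothetical sub-patterns.
# They are NOT the ones defined in countme/regex.py.
ASSUMED_FIELDS = dict(
    host=r"\S+",
    identity=r"\S+",
    user=r"\S+",
    time=r"[^\]]+",
    method=r"[A-Z]+",
    path=r"[^\s?]+",
    query=r"\S*",
    query_match="?",  # makes the "?query" part optional
    protocol=r"HTTP/\d(?:\.\d)?",
    status=r"\d{3}",
    nbytes=r"\d+|-",
    referrer=r'[^"]*',
    user_agent=r'[^"]*',
)

LOG_RE = re.compile(LOG_PATTERN_FORMAT.format(**ASSUMED_FIELDS))

SAMPLE_LINE = (
    '195.181.167.200 - - [12/Jul/2021:00:00:10 +0000] '
    '"GET /metalink?repo=fedora-modular-33&arch=x86_64&countme=3 HTTP/2.0" '
    '200 26520 "-" "libdnf (Fedora 33; cloud; Linux.x86_64)"'
)


def test_sample_line_is_parsed():
    match = LOG_RE.match(SAMPLE_LINE)
    assert match is not None
    assert match.group("host") == "195.181.167.200"
    assert match.group("time") == "12/Jul/2021:00:00:10 +0000"
    assert match.group("method") == "GET"
    assert match.group("path") == "/metalink"
    assert match.group("query") == "repo=fedora-modular-33&arch=x86_64&countme=3"
    assert match.group("protocol") == "HTTP/2.0"
    assert match.group("status") == "200"
    assert match.group("nbytes") == "26520"
    assert match.group("user_agent") == "libdnf (Fedora 33; cloud; Linux.x86_64)"


def test_line_without_query():
    # Same line with the query string removed; the query group stays unset.
    line = SAMPLE_LINE.replace("?repo=fedora-modular-33&arch=x86_64&countme=3", "")
    match = LOG_RE.match(line)
    assert match is not None
    assert match.group("path") == "/metalink"
    assert match.group("query") is None


def test_garbage_is_rejected():
    # An invalid input: not an access-log line at all.
    assert LOG_RE.match("this is not a log line") is None
```

Tests against the real module would import its compiled patterns instead of rebuilding them here, so that the assertions exercise the actual sub-patterns.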
Metadata Update from @nphilipp: - Issue untagged with: DNF Counting
Metadata Update from @nphilipp: - Issue tagged with: DNF Counting
Metadata Update from @patrikp: - Issue assigned to patrikp
WIP PR: #25
Current status: all the tests have been parametrized and the base cases are covered. #25 has been merged. I now intend to add some invalid inputs to the tests. As there's no need for a new issue card, I'm moving this one from In Review back to In Progress to indicate that some work is still being done on it.
I would create a new issue for this.
Metadata Update from @asaleh: - Issue status updated to: Closed (was: Open)