When you’re just getting started with Python, it’s easy to feel overwhelmed by all the things you could build. That’s why I’ve been doing practical project-based learning—small, real-world tools I can actually use or explain.
This Log File Parser is one of them. If you’ve ever worked in IT or sysadmin, you know how chaotic system logs can be. This tool helps filter out the noise so you can find the signal.
What You’ll Learn
- Reading large files line by line (efficient file I/O)
- String parsing using `.split()` and regex
- Filtering logs by keyword, time, or error level
- Thinking like a sysadmin or incident responder
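The steps below use `.split()` throughout; as a quick preview of the regex alternative mentioned in the list above, here is a sketch that parses a single line with named groups (the pattern and group names are my own, matched to the `sample.log` format used in this project):

```python
import re

# Pattern for lines shaped like "YYYY-MM-DD HH:MM:SS LEVEL message"
LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)"
)

line = "2025-08-01 13:46:01 ERROR Failed to fetch data"
match = LOG_PATTERN.match(line)
if match:
    print(match.group("level"))    # ERROR
    print(match.group("message"))  # Failed to fetch data
```

Named groups make the code self-documenting, but for simple fixed-format logs the `.split()` approach used below is perfectly adequate.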
Project Setup
Create a new folder called `log_file_parser` and inside it, create:

- `parser.py` – our main script
- `sample.log` – your test log file

Example `sample.log`:

```
2025-08-01 13:45:23 INFO User login successful
2025-08-01 13:46:01 ERROR Failed to fetch data
2025-08-01 13:47:12 WARNING Disk space low
2025-08-01 13:48:33 INFO Backup completed
2025-08-01 13:49:45 ERROR Timeout while connecting to database
```
Step-by-Step with Fully Commented Code
Read the File Line by Line
```python
# Open the log file using a context manager
with open('sample.log', 'r') as file:
    # Loop through each line in the file
    for line in file:
        # Print each line, removing the extra newline at the end
        print(line.strip())
```
✅ This confirms that you’re able to access and read the log file correctly.
Filter by ERROR Level
```python
# Open the log file
with open('sample.log', 'r') as file:
    # Go through each line in the file
    for line in file:
        # Split the line into 3 parts: date, time, and the rest (log level + message)
        parts = line.strip().split(" ", 2)
        # Skip blank or malformed lines that don't have all three parts
        if len(parts) < 3:
            continue
        # Combine date and time to form the full timestamp
        timestamp = parts[0] + " " + parts[1]
        # The remaining part of the line includes the log level and message
        log_level_and_message = parts[2]
        # If the line starts with 'ERROR', print it with the timestamp
        if log_level_and_message.startswith("ERROR"):
            print(f"{timestamp} - {log_level_and_message}")
```
✅ This helps you extract only the error logs from the entire file.
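As a small extension of the same split logic (not one of the original steps), you can tally how many lines each log level produced, which is often the first question during triage. This sketch uses an in-memory list standing in for `sample.log` so it runs on its own:

```python
from collections import Counter

# Stand-in for the contents of sample.log, so this sketch is self-contained
lines = [
    "2025-08-01 13:45:23 INFO User login successful",
    "2025-08-01 13:46:01 ERROR Failed to fetch data",
    "2025-08-01 13:47:12 WARNING Disk space low",
    "2025-08-01 13:48:33 INFO Backup completed",
    "2025-08-01 13:49:45 ERROR Timeout while connecting to database",
]

# The log level is the third space-separated field on each line
levels = Counter(line.split(" ", 3)[2] for line in lines)
print(levels["ERROR"])    # 2
print(levels["WARNING"])  # 1
```

`collections.Counter` saves you from writing the usual "check key, increment" boilerplate by hand.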
Filter by Keyword
```python
# Define the keyword you want to search for in the logs
keyword = "database"

# Open the log file
with open('sample.log', 'r') as file:
    # Check each line to see if it contains the keyword
    for line in file:
        # Convert both the line and keyword to lowercase to make the search case-insensitive
        if keyword.lower() in line.lower():
            # Print the matching line
            print(line.strip())
```
✅ You can replace `"database"` with anything else, like `"backup"` or `"disk"`.
Filter Logs by Time Range (Optional Bonus)
```python
from datetime import datetime  # Import datetime to handle timestamp comparisons

# Define your start and end time as datetime objects
start_time = datetime.strptime("2025-08-01 13:46:00", "%Y-%m-%d %H:%M:%S")
end_time = datetime.strptime("2025-08-01 13:49:00", "%Y-%m-%d %H:%M:%S")

# Open the log file
with open('sample.log', 'r') as file:
    for line in file:
        # Split the line into parts again
        parts = line.strip().split(" ", 2)
        # Skip blank or malformed lines
        if len(parts) < 3:
            continue
        # Rebuild the timestamp from the date and time parts
        timestamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        # If the timestamp is within your desired time range, print the log
        if start_time <= timestamp <= end_time:
            print(line.strip())
```
✅ This is useful when you’re investigating what happened between two specific times.
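To tie the steps together, the three filters above can be folded into one function. This is a sketch of my own, not code from the project repo; the function and parameter names (`filter_logs`, `level`, `keyword`, `start`, `end`) are hypothetical:

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"

def filter_logs(lines, level=None, keyword=None, start=None, end=None):
    """Yield stripped lines that pass every filter that was supplied."""
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        timestamp = datetime.strptime(parts[0] + " " + parts[1], TS_FORMAT)
        if level and not parts[2].startswith(level):
            continue
        if keyword and keyword.lower() not in line.lower():
            continue
        if start and timestamp < start:
            continue
        if end and timestamp > end:
            continue
        yield line.strip()

# Small in-memory sample so the sketch runs without sample.log
lines = [
    "2025-08-01 13:46:01 ERROR Failed to fetch data",
    "2025-08-01 13:48:33 INFO Backup completed",
    "2025-08-01 13:49:45 ERROR Timeout while connecting to database",
]
for entry in filter_logs(lines, level="ERROR", keyword="database"):
    print(entry)  # 2025-08-01 13:49:45 ERROR Timeout while connecting to database
```

Because every filter defaults to `None`, you can mix and match them freely, e.g. `filter_logs(file, level="ERROR")` or `filter_logs(file, start=start_time, end=end_time)`.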
Full Project on GitHub
🔗 github.com/TitusGitari/log-file-parser
Tools & Resources I Used
Final Thoughts
This project is small, but it’s a real-world skill. Whether you’re triaging an incident, debugging a system, or just want to get more comfortable with parsing data, log file parsers are something you’ll encounter again and again.