Running External Commands with `subprocess.run`

DevOps automation often requires invoking existing CLI tools or scripts to leverage their functionality without re-implementing it in Python.
The subprocess module provides a secure and flexible interface to spawn child processes, control their input/output streams, and inspect their exit statuses.
The modern recommended method is subprocess.run(), which combines execution, output capture, and error handling in a single call.

import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('Hello from subprocess.')"],
    capture_output=True,
    text=True
)

print(f"Return code: {result.returncode}")
print(f"Stdout: {result.stdout.strip()}")

Why `subprocess`? The Old Ways

Older approaches like os.system() invoke a shell directly, making them vulnerable to injection and offering limited control over I/O streams.
The subprocess module was introduced to provide finer control, better security, and a consistent API across platforms.
Functions such as subprocess.call(), check_output(), and Popen exist, but subprocess.run() (Python 3.5+) simplifies most common use cases into one interface.

The subprocess.run() Function

args should be a list of strings where the first element is the command and the rest are its parameters.
capture_output=True captures both stdout and stderr into the returned CompletedProcess.
text=True decodes bytes into strings using the default encoding (normally the locale's preferred encoding); pass encoding="utf-8" to make the choice explicit.
check=True raises a CalledProcessError for non-zero exit codes, allowing you to handle failures via exceptions.
shell=False (the default) avoids invoking a shell, preventing injection vulnerabilities; use shell=True only if you fully control the command string.
The returned CompletedProcess has attributes args, returncode, stdout, and stderr for introspection.

import subprocess
import sys

cmd = [
    sys.executable,
    "-c",
    """print("Hello from subprocess")
invalid_function()"""
]

result = subprocess.run(cmd, capture_output=True, text=True)
print(f"Args: {result.args}")
print(f"Stdout: {result.stdout.strip()}")
print(f"Stderr: {result.stderr.strip()}")
print(f"Return code: {result.returncode}")
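The check=True behaviour described above can be sketched as follows. The failing child command is a made-up stand-in for a real tool, and the variable name outcome is only for illustration.

```python
import subprocess
import sys

# Hypothetical failing command: the child writes to stderr and exits with status 3.
cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]

try:
    subprocess.run(cmd, capture_output=True, text=True, check=True)
    outcome = "succeeded"
except subprocess.CalledProcessError as exc:
    # The exception exposes returncode, stdout and stderr, like CompletedProcess.
    outcome = f"failed with code {exc.returncode}: {exc.stderr.strip()}"

print(outcome)
```

With check=True the failure path cannot be skipped by accident: either the command succeeds or control jumps to the except block.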

Basic Command Execution

Construct your command as a list, choosing the tool and its arguments explicitly.
Use capture_output=True and text=True to get human-readable strings.
Inspect result.returncode to determine if the command succeeded (zero) or failed (non-zero).

import subprocess
import platform

if platform.system() == "Windows":
    cmd = ["cmd", "/c", "ver"]  # "ver" is a cmd.exe built-in, not a standalone executable
else:
    cmd = ["uname", "-a"]

result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout.strip())

Common Pitfalls & How to Avoid Them

Forgetting capture_output=True means result.stdout and result.stderr will be None, so you cannot inspect them.
Omitting text=True leaves you with raw bytes that require manual decoding.
Leaving check at its default (False) and never inspecting result.returncode lets failures go unnoticed.
Invoking a shell with shell=True and untrusted input enables injection attacks—always prefer shell=False.
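To illustrate the last point, here is a minimal sketch of why the list form with shell=False is safer. The user_input value is a hypothetical untrusted string; the child simply echoes its first argument.

```python
import subprocess
import sys

# Hypothetical untrusted value, e.g. taken from a form field or a ticket title.
user_input = "report.txt; echo INJECTED"

# With a list and shell=False (the default), the value arrives as one literal argument.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(f"Child received: {received!r}")
```

No shell ever parses the string, so the semicolon has no special meaning and nothing after it is executed.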

Temporary Files and Directories

Automation scripts often need scratch space for intermediate data without cluttering the filesystem or risking name collisions.
Hardcoding names like /tmp/my_file.txt can lead to security issues, collisions, and manual cleanup.
The tempfile module provides secure, unique temporary files and directories with optional automatic cleanup.

Why Use the tempfile Module?

It creates files with secure default permissions, preventing unauthorized access on multiuser systems.
It generates unique names automatically, avoiding collisions when multiple script instances run concurrently.
It integrates with context managers (with), enabling automatic cleanup of resources when they're no longer needed.
It works across Windows, macOS, and Linux, choosing an appropriate temp location on each platform.

import tempfile
import os

temp_dir = tempfile.gettempdir()
print(f"Default temporary directory: {temp_dir}")
print(f"Sample contents: {os.listdir(temp_dir)[:5]}")

tempfile.TemporaryFile()

Creates an unnamed temporary file opened in binary or text mode.
On UNIX-like systems it typically has no name in the filesystem; on Windows it may appear but remains temporary.
The file is deleted automatically when closed or when the context block exits.
Ideal for internal scratch space that doesn’t need to be passed to external processes.

import tempfile

with tempfile.TemporaryFile(mode="w+t", encoding="utf-8") as temp_file:
    temp_file.write("This is some temporary data.")
    temp_file.seek(0)
    print("Content from TemporaryFile:")
    print(temp_file.read())

tempfile.NamedTemporaryFile()

Creates a temporary file with a visible name in the filesystem.
Default delete=True removes the file when closed; delete=False leaves it for manual cleanup.
Use when you need to pass a filename to another process or library.
Supports custom suffix, prefix, and dir parameters for naming and placement.

import tempfile
from pathlib import Path

# Auto-delete on with exit
path = None

with tempfile.NamedTemporaryFile(mode="w+t", encoding="utf-8", suffix=".log") as temp_file:
    path = Path(temp_file.name)
    print(f"Created temp file at {path}. Exists: {path.exists()}")

print(f"After close. Exists? {path.exists()}")

# Persist after with exit
path_persistent = None

with tempfile.NamedTemporaryFile(
    mode="w+t",
    encoding="utf-8",
    suffix=".log",
    delete=False
) as temp_file:
    path_persistent = Path(temp_file.name)
    print(f"Created temp file at {path_persistent}. Exists: {path_persistent.exists()}")

print(f"After close. Exists? {path_persistent.exists()}")

if path_persistent.exists():
    path_persistent.unlink()

print(f"After unlink. Exists? {path_persistent.exists()}")

tempfile.TemporaryDirectory()

Creates a new temporary directory; in a with block, the as target is the directory's path as a string.
When used in a with block, the directory and everything inside it are deleted on exit.
Ideal for workflows that produce multiple temporary files or subdirectories.

import tempfile
from pathlib import Path

temp_path = None

with tempfile.TemporaryDirectory(prefix="batch_job_") as temp_dir:
    print(f"{temp_dir} - type: {type(temp_dir)}")
    temp_path = Path(temp_dir)
    (temp_path / "file1.txt").write_text("data")
    subdir = temp_path / "subdir"
    subdir.mkdir(exist_ok=True)
    (subdir / "file2.txt").write_text("data2")
    print(f"Contents: {[p.name for p in temp_path.iterdir()]}")

print(f"After close. Exists? {temp_path.exists()}")

Common Pitfalls & How to Avoid Them

Calling os.rmdir() or Path.rmdir() on a non-empty directory raises an error; use shutil.rmtree() for recursive deletion.
Forgetting to delete files created with delete=False in NamedTemporaryFile can leave orphaned files.
On Windows, a NamedTemporaryFile that is still open usually cannot be opened again by name, so other processes can't read it. Use delete=False and close it before sharing the name.
Relying on a temporary file’s name after closing a TemporaryFile is impossible, since it may never have had one.
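The delete=False pattern from the list above might look like the sketch below. The child process is a stand-in for any external tool that takes a filename, and the names shared_path and child_output are illustrative.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# delete=False keeps the file on disk after the with block closes it.
with tempfile.NamedTemporaryFile(
    mode="w", encoding="utf-8", suffix=".txt", delete=False
) as temp_file:
    temp_file.write("shared via filename")
    shared_path = Path(temp_file.name)

try:
    # The file is closed at this point, so a child process can open it by name.
    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import sys; print(open(sys.argv[1], encoding='utf-8').read())",
            str(shared_path),
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    child_output = result.stdout.strip()
    print(child_output)
finally:
    # With delete=False, cleanup is the script's responsibility.
    shared_path.unlink(missing_ok=True)
```

Putting the unlink in a finally block keeps the file from being orphaned even if the child process fails.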

Filesystem Operations (os & shutil)

DevOps scripts often need to create, delete, copy, and move files and directories as part of automation workflows.
The os module provides low-level filesystem functions, while shutil offers higher-level operations like copying and recursive removal.
These tools work hand-in-hand with pathlib (for path manipulation) to build robust file management scripts.

Listing Directory Contents

Use os.listdir(path) to get a list of entry names (files and subdirectories) in a directory.
Use Path(path).iterdir() to iterate over Path objects, which you can query further with methods like .is_file() or .is_dir().
os.listdir returns a plain list of strings; iterdir() yields full Path objects, making downstream operations more convenient.

import os
from pathlib import Path
import shutil

"""
Directory structure:

temp_listing_dir/
├── file1.txt
├── file2.log
└── subdir/
    └── subfile.py
"""

tmp_path = Path("temp_listing_dir")
tmp_path.mkdir(exist_ok=True)
(tmp_path / "file1.txt").touch()
(tmp_path / "file2.log").touch()
(tmp_path / "subdir").mkdir(exist_ok=True)
(tmp_path / "subdir" / "subfile.py").touch()

print(f"--- os.listdir(\"{tmp_path}\") ---")
for name in os.listdir(tmp_path):
    print(name)

print(f"--- Path(\"{tmp_path}\").iterdir() ---")
for entry in tmp_path.iterdir():
    print(entry)

shutil.rmtree(tmp_path)
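Because iterdir() yields Path objects, filtering by entry type takes one expression. The sketch below builds a throwaway directory (the file names are arbitrary) and separates files from subdirectories.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "file1.txt").touch()
    (root / "file2.log").touch()
    (root / "subdir").mkdir()

    # is_file() / is_dir() are available directly on each yielded Path.
    files = sorted(p.name for p in root.iterdir() if p.is_file())
    dirs = sorted(p.name for p in root.iterdir() if p.is_dir())

print(f"Files: {files}")
print(f"Directories: {dirs}")
```

The names are sorted because neither os.listdir nor iterdir() guarantees any particular order.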

Creating Directories

os.mkdir(path) creates a single directory and fails if parents don’t exist or if it already exists.
os.makedirs(path, exist_ok=False) creates all intermediate directories; set exist_ok=True to suppress the error when the leaf directory already exists.
Path(path).mkdir(parents=True, exist_ok=True) is the pathlib equivalent for recursive, idempotent creation.

from pathlib import Path
import shutil

single = Path("my_single_dir")

try:
    single.mkdir(exist_ok=True)
    print(f"Created {single}: {single.exists()}")
finally:
    if single.exists():
        single.rmdir()

nested = Path("parent/child/grandchild")
nested.mkdir(parents=True, exist_ok=True)
print(f"Created nested path {nested}: {nested.exists()}")

shutil.rmtree("parent")

Removing Files and Directories

os.remove(path) or Path(path).unlink() deletes a single file and raises FileNotFoundError if it is missing; Path.unlink(missing_ok=True) (Python 3.8+) suppresses that error.
os.rmdir(path) or Path(path).rmdir() removes an empty directory only.
shutil.rmtree(path) recursively deletes a directory tree and all contents; use with extreme caution.

from pathlib import Path
import shutil

"""
Directory structure:

.
├── temp_file.txt
├── empty_dir/
└── tree_root/
    └── child/
        └── inner.txt
"""

temp_file = Path("temp_file.txt")
temp_file.touch()

empty_dir = Path("empty_dir")
empty_dir.mkdir(exist_ok=True)

tree = Path("tree_root/child")
tree.mkdir(parents=True, exist_ok=True)
(tree / "inner.txt").touch()

temp_file.unlink()
print(f"Removed file {temp_file}. Exists? {temp_file.exists()}")
empty_dir.rmdir()
print(f"Removed dir {empty_dir}. Exists? {empty_dir.exists()}")

shutil.rmtree("tree_root")
print(f"Removed \"tree_root\" recursively. Exists? {tree.exists()}")

Copying Files and Directories

shutil.copy(src, dst) copies a file's contents and permission bits but does not preserve other metadata such as timestamps.
shutil.copy2(src, dst) copies files and attempts to preserve metadata.
shutil.copytree(src_dir, dst_dir, dirs_exist_ok=False) recursively copies an entire directory tree.

import shutil
from pathlib import Path

"""
Directory structure:

src_copy/
├── a.txt
└── sub/
    └── b.txt
"""

src = Path("src_copy")
src.mkdir(exist_ok=True)
(src / "a.txt").write_text("A")
(src / "sub").mkdir(exist_ok=True)
(src / "sub" / "b.txt").write_text("B")

dest_file = Path("copied_a.txt")
dest_file_metadata = Path("copied_a_metadata.txt")
shutil.copy(src / "a.txt", dest_file)
shutil.copy2(src / "a.txt", dest_file_metadata)

dest_dir = Path("copied_src")

if dest_dir.exists():
    shutil.rmtree(dest_dir)

shutil.copytree(src, dest_dir)

shutil.rmtree("src_copy")
shutil.rmtree(dest_dir)
dest_file.unlink()
dest_file_metadata.unlink()
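As an alternative to deleting the destination first, copytree can merge into an existing directory with dirs_exist_ok=True (Python 3.8+). This is a small sketch with made-up file names, run inside a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    dst = Path(tmp) / "dst"
    src.mkdir()
    dst.mkdir()
    (src / "a.txt").write_text("new A", encoding="utf-8")
    (dst / "a.txt").write_text("old A", encoding="utf-8")
    (dst / "keep.txt").write_text("untouched", encoding="utf-8")

    # Without dirs_exist_ok=True this call raises FileExistsError.
    shutil.copytree(src, dst, dirs_exist_ok=True)

    merged = {p.name: p.read_text(encoding="utf-8") for p in dst.iterdir()}

print(merged)
```

Files with the same name are overwritten, while files that exist only in the destination are left alone.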

Moving Files and Directories

Use shutil.move(src, dst) to move or rename files and directories in one step.
If dst is an existing directory, src is moved into it; otherwise src is moved or renamed to the path dst, and an existing file there may be overwritten depending on the platform.
Moving across filesystems may involve a copy-and-delete under the hood.

import shutil
from pathlib import Path

"""
Directory structure:

.
├── move_me.txt
├── move_dir/
│   └── inside.txt
└── dest_folder/
"""

file_src = Path("move_me.txt")
file_src.write_text("Moving file.")

dir_src = Path("move_dir")
dir_src.mkdir(exist_ok=True)
(dir_src / "inside.txt").write_text("Inside source dir.")

dest_dir = Path("dest_folder")
dest_dir.mkdir(exist_ok=True)

try:
    shutil.move(file_src, dest_dir)
except Exception as e:
    print(f"Error occurred: {e}")

file_src2 = dest_dir / file_src.name
new_name = Path("renamed.txt")
shutil.move(file_src2, new_name)

try:
    shutil.move(dir_src, dest_dir)
except Exception as e:
    print(f"Error occurred: {e}")

shutil.rmtree(dest_dir)
if new_name.exists():
    new_name.unlink()

Common Pitfalls & How to Avoid Them

PermissionError: Operations fail if the script lacks rights. Ensure correct ownership or run with appropriate privileges.
Non-empty Directories: os.rmdir() and Path.rmdir() only remove empty dirs. Use shutil.rmtree() for recursive deletion, but do so carefully.
Existing Destinations: shutil.copytree() errors if the target exists unless dirs_exist_ok=True. Consider pre-cleanup or that flag.
Irreversible Deletions: There is no undo for os.remove, os.rmdir, or shutil.rmtree(). Add confirmation or dry-run modes when deleting!
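One way to add the dry-run mode suggested above is a thin wrapper around shutil.rmtree. The helper name remove_tree and its dry_run flag are hypothetical, not part of the standard library.

```python
import shutil
import tempfile
from pathlib import Path


def remove_tree(path: Path, dry_run: bool = True) -> bool:
    """Delete a directory tree; with dry_run=True only report what would happen.

    Hypothetical helper for illustration. Returns True if something was deleted.
    """
    if not path.is_dir():
        print(f"Skipping {path}: not a directory")
        return False
    if dry_run:
        count = sum(1 for _ in path.rglob("*"))
        print(f"[dry-run] would delete {path} ({count} entries)")
        return False
    shutil.rmtree(path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "build"
    (target / "cache").mkdir(parents=True)
    (target / "cache" / "obj.bin").touch()

    deleted_dry = remove_tree(target)  # default is a dry run: nothing is removed
    exists_after_dry = target.exists()
    deleted_real = remove_tree(target, dry_run=False)
    exists_after_real = target.exists()
```

Making the dry run the default means a caller has to opt in explicitly before anything is destroyed.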

Working with Environment Variables

Environment variables are dynamic, named values provided by the operating system to running processes, enabling configuration of behavior without code modifications.
They allow applications to adapt across development, staging, and production environments by externalizing configuration data such as API keys, file paths, and feature flags.
Python’s os module offers simple interfaces to access and manage these variables, promoting separation of code and configuration.

import os

for key in ["HOME", "SHELL"]:
    value = os.getenv(key)
    print(f"{key} = {value if value else "Not set"}")

env_keys = list(os.environ.keys())
print(f"We have {len(env_keys)} environment variables available!")

for key in env_keys[:5]:
    print(key)

Accessing Environment Variables with `os.getenv()`

The os.getenv function retrieves the value of an environment variable by key, returning None or a provided default if the key is not found.
It prevents KeyError exceptions by offering a safe access pattern for optional configuration settings.
Since environment variables are always strings, any expected non-string types require explicit conversion after retrieval.

import os 

os.environ["APP_API_KEY"] = "ab12cd34"

api_key = os.getenv("APP_API_KEY")
debug_mode = os.getenv("DEBUG_MODE", "false").lower() in ("1", "true", "yes")  # values are strings; compare explicitly

if api_key:
    print(f"API key found: {api_key[:4]}... (masked)")
else:
    print("APP_API_KEY not set.")

print(f"Debug mode: {debug_mode}")

Accessing Environment Variables with `os.environ`

os.environ behaves like a dictionary mapping environment variable names to their string values.
Accessing a missing key via os.environ['KEY'] raises a KeyError, making it suitable for mandatory variables.
One should guard against missing keys by checking membership or catching KeyError to handle critical configuration errors.

import os

try:
    java_home = os.environ["JAVA_HOME"]
    print(java_home)
except KeyError:
    print("JAVA_HOME environment variable not set.")
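When several variables are mandatory, a membership check can report all missing names at once instead of failing on the first KeyError. The variable names below are hypothetical and are set in the script only so that the sketch is self-contained.

```python
import os

# Hypothetical variable names, prepared here so the example runs on its own.
os.environ["DEMO_DB_HOST"] = "db.internal"
os.environ.pop("DEMO_DB_PASSWORD", None)

required = ["DEMO_DB_HOST", "DEMO_DB_PASSWORD"]
missing = [name for name in required if name not in os.environ]

if missing:
    print(f"Missing required variables: {', '.join(missing)}")
else:
    print("All required variables are set.")

# Remove the demo variable again so the process environment is left unchanged.
del os.environ["DEMO_DB_HOST"]
```

Running this check at startup surfaces configuration problems before any real work begins.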

Setting Environment Variables Within Python

While environment variables are typically set externally, os.environ can be modified at runtime to affect the current process and its children.
Assigning to os.environ['KEY'] makes the variable available to any subprocesses spawned by the script.
Deleting an entry from os.environ removes it for subsequent operations within the process, but changes do not persist after the script exits.

import os
import sys
import subprocess

print(f"Initial MY_CUSTOM_VAR: {os.getenv("MY_CUSTOM_VAR")}")

os.environ["MY_CUSTOM_VAR"] = "SetByOurScript"
print(f"Updated MY_CUSTOM_VAR: {os.getenv("MY_CUSTOM_VAR")}")

result = subprocess.run(
    [
        sys.executable,
        "-c",
        """import os
print(f"Child sees MY_CUSTOM_VAR: {os.getenv('MY_CUSTOM_VAR')}")""",
    ],
    capture_output=True,  # without this, result.stdout would be None
    text=True,
)

print(result.stdout.strip())

del os.environ["MY_CUSTOM_VAR"]
print(f"After deletion, MY_CUSTOM_VAR: {os.getenv("MY_CUSTOM_VAR")}")
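If a variable should reach only one child process, the env parameter of subprocess.run avoids touching os.environ at all. The variable name DEMO_CHILD_ONLY is a hypothetical example.

```python
import os
import subprocess
import sys

# Copy the current environment and add a hypothetical variable for the child only.
child_env = os.environ.copy()
child_env["DEMO_CHILD_ONLY"] = "visible-to-child"

result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getenv('DEMO_CHILD_ONLY'))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
child_value = result.stdout.strip()
parent_value = os.getenv("DEMO_CHILD_ONLY")

print(f"Child sees: {child_value}")
print(f"Parent sees: {parent_value}")
```

Starting from os.environ.copy() matters: passing a dict with only the new variable would strip PATH and other settings the child may need.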

Using dotenv to Manage Local Environment Files

The python-dotenv library lets you keep sensitive and environment-specific values in a .env file instead of the shell.
A .env file lives alongside your script and contains lines like KEY=value; it's loaded at runtime into os.environ.
Install with pip install python-dotenv==1.1.0 (the version is pinned here so everyone following along uses the same release; you can omit it elsewhere), then call load_dotenv() before any os.getenv calls.
This approach keeps your shell clean and makes it easy to commit example .env.example files without secrets.
Remember not to commit actual .env files with real secrets! Add them to .gitignore.

import os
from dotenv import load_dotenv

os.environ["MY_DOTENV_VAR"] = "setFromJupyter"

load_dotenv(override=True)

secret_dotenv_value = os.getenv("MY_DOTENV_VAR")

print(f"Retrieved MY_DOTENV_VAR with value {secret_dotenv_value}")

Common Pitfalls & How to Avoid Them

Environment variable names are case-sensitive on Linux and macOS but not on Windows, where os.environ stores names in uppercase; stick to one consistent casing (conventionally upper case) to avoid unexpected missing values.
Forgetting that all environment variable values are strings can cause type errors; always convert to the intended type like int or bool after retrieval.
Accessing a missing mandatory variable via os.environ raises KeyError; avoid unhandled errors by checking membership or catching exceptions.
Storing highly sensitive secrets in plain environment variables carries security risks; for production use, consider managed secrets solutions like Vault or AWS Secrets Manager.

import os
from dotenv import load_dotenv

load_dotenv(override=True)

number_dotenv_value = os.getenv("MY_NUMBER_VAR")

print(type(number_dotenv_value))
# print(number_dotenv_value + 45) # Uncommenting will raise TypeError because number_dotenv_value is a string!
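The conversion step can be wrapped in small helpers so it is not forgotten. The functions env_bool and env_int, the accepted truthy spellings, and the DEMO_* variable names are all assumptions made for this sketch.

```python
import os

TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def env_int(name: str, default: int) -> int:
    """Interpret an environment variable as an integer (hypothetical helper)."""
    raw = os.getenv(name)
    return default if raw is None else int(raw)


# Hypothetical variables, prepared here so the example runs on its own.
os.environ["DEMO_DEBUG"] = "False"
os.environ["DEMO_WORKERS"] = "8"
os.environ.pop("DEMO_TIMEOUT", None)

debug = env_bool("DEMO_DEBUG")                 # "False" is a non-empty string, yet parses to False
workers = env_int("DEMO_WORKERS", default=2)   # converted from "8"
timeout = env_int("DEMO_TIMEOUT", default=30)  # unset, so the default applies

print(debug, workers, timeout)
```

Note that bool("False") would be True, which is why the boolean helper compares against an explicit set of spellings.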

Working with CSV files

CSV (Comma Separated Values) is a plain-text tabular format where each line is a row and fields are delimited (commonly by commas).
Widely used for spreadsheets, database exports, DevOps reports or inventories.
Python’s built-in csv module handles reading, writing, quoting, delimiters, headers, and dialects.
Always open files with newline='' and encoding='utf-8' for cross-platform consistency.

CSV Format Basics

Each row represents a record; fields separated by a delimiter (comma by default).
Optional header row defines column names.
Fields containing delimiters, quotes, or newlines must be quoted (usually with double quotes).
Alternative delimiters (tabs, semicolons) and quoting conventions are supported via dialects and parameters.
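The quoting rules above can be observed by writing one awkward row to an in-memory buffer. The row contents are invented for the demonstration.

```python
import csv
import io

row = ["web01", "frontend,prod", 'said "hi"']

# Default dialect: comma delimiter, fields quoted only when necessary.
comma_buf = io.StringIO(newline="")
csv.writer(comma_buf).writerow(row)
comma_line = comma_buf.getvalue()

# Alternative delimiter: the embedded comma no longer forces quoting.
semi_buf = io.StringIO(newline="")
csv.writer(semi_buf, delimiter=";").writerow(row)
semi_line = semi_buf.getvalue()

print(comma_line, end="")
print(semi_line, end="")

# Reading the comma-delimited line back recovers the original fields.
parsed = next(csv.reader(io.StringIO(comma_line, newline="")))
print(parsed)
```

Embedded double quotes are escaped by doubling them, which is why the third field grows when written.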

Reading CSV files with `csv.reader`

Iterates over rows, returning each as a list of strings.
Use next(reader) to skip or extract the header.
Accepts delimiter, quotechar, and other formatting parameters.

import csv
from pathlib import Path

csv_path = Path("servers.csv")

with csv_path.open("r", encoding="utf-8", newline="") as file:
    reader = csv.reader(file)
    header = next(reader)
    print(f"Header: {header}")

    for idx, row in enumerate(reader, start=1):
        print(f"Row {idx}: {row}")

Reading with `csv.DictReader`

Reads rows into dictionaries using the header row as keys.
Access fields by column name instead of index.
Optional fieldnames argument supplies the keys explicitly; the first row is then read as data, so use it for files that have no header row.

import csv
from pathlib import Path

csv_path = Path("servers.csv")

with csv_path.open("r", encoding="utf-8", newline="") as file:
    dict_reader = csv.DictReader(file)
    print(f"Fieldnames: {dict_reader.fieldnames}")

    for idx, record in enumerate(dict_reader, start=1):
        print(f"Record {idx}: {record}")

Example of servers.csv

hostname,ip_address,role,status,tags
web01,10.0.1.5,webserver,running,"frontend,prod"
db01,10.0.2.10,database,maintenance,"backend,staging"
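For exports that arrive without a header row, DictReader can be given the column names through fieldnames. The data and the column names in this sketch are assumptions chosen to resemble servers.csv.

```python
import csv
import io

# Hypothetical headerless export; the columns are assumed to be host, IP and role.
raw = "web01,10.0.1.5,webserver\ndb01,10.0.2.10,database\n"

reader = csv.DictReader(
    io.StringIO(raw, newline=""),
    fieldnames=["hostname", "ip_address", "role"],
)
records = list(reader)

for record in records:
    print(record)
```

Because fieldnames is supplied, the first line is treated as a record rather than consumed as a header.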

Writing with `csv.writer`

Write rows from lists using .writerow() or .writerows().
Open file with newline='' to avoid blank lines.
Control delimiter and quoting via parameters.

import csv
from pathlib import Path

data = [
    ["hostname", "ip_address", "role"],
    ["web02", "10.0.1.6", "webserver"],
    ["app01", "10.0.3.15", "application"],
]

out_path = Path("output_basic.csv")

with out_path.open("w", encoding="utf-8", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)

Writing with `csv.DictWriter`

Write dictionaries using fieldnames to define header and column order.
Call .writeheader() before .writerows().

import csv
from pathlib import Path

records = [
    {
        "host": "web01",
        "port": "80",
        "status": "running"
    },
    {
        "host": "db02",
        "status": "maintenance",
        "tags": "prod,finance"
    }
]

out_dict_path = Path("output_dict.csv")
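# Collect field names in first-seen order; a set would make the column order arbitrary.
fieldnames = []

for record in records:
    for key in record:
        if key not in fieldnames:
            fieldnames.append(key)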

with out_dict_path.open("w", encoding="utf-8", newline="") as file:
    writer = csv.DictWriter(
        file,
        fieldnames=fieldnames,
        restval="undefined",
        extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(records)

Working with YAML files

YAML (“YAML Ain’t Markup Language”) focuses on human readability. Indentation replaces braces and brackets, comments are allowed, and quoting is usually optional.
DevOps tooling (Kubernetes, Ansible, GitHub Actions, many app configs) standardizes on YAML for its clarity and brevity.
JSON is excellent for machine-to-machine communication, but its strict syntax (no comments, heavy quoting) can feel verbose to humans maintaining config files.
Python’s standard library lacks YAML support; PyYAML is the community-standard package to fill that gap.

YAML Syntax and Features

Structure comes from indentation with spaces; tab characters are not allowed for indentation.
Mappings use key: value; sequences use a leading hyphen (-) plus a space.
Scalars include strings, numbers, booleans (true / false; PyYAML, which follows YAML 1.1, also reads yes / no as booleans), and null.
Comments begin with #.
Multi-line scalars can be literal (|) or folded (>).
Anchors (&) and aliases (*) avoid repetition by re-using defined blocks.
YAML is a superset of JSON: most valid JSON documents are also valid YAML.

import yaml

snippet = """
service: &svc
  name: user-api
  port: 8080
  enabled: true
  tags:
    - api
    - user
    - internal
staging:
  <<: *svc
  replicas: 2
production:
  <<: *svc
  replicas: 4
"""

parsed = yaml.safe_load(snippet)
print(parsed)

multiline_demo = """
literal: |
  line 1
  line 2
  line 3
folded: >
  This is a long string that
  could go out of screen, so
  we will break this up into
  multiple lines to improve
  readability.
"""
print("\n")
print(yaml.safe_load(multiline_demo))

Deserializing YAML with `yaml.safe_load`

Prefer yaml.safe_load (or passing Loader=yaml.SafeLoader) to prevent arbitrary-code execution; avoid yaml.load on untrusted data.
Accepts a string or an open text file handle and returns native Python structures.
Wrap calls in try / except yaml.YAMLError to catch malformed input.

import yaml
from pathlib import Path

compose = Path("compose.yaml")

try:
    with compose.open("r", encoding="utf-8") as file:
        config = yaml.safe_load(file)
        print(f"Compose version: {config["version"]}")

        for svc, options in config["services"].items():
            print(f"{svc.capitalize()} image\t: {options["image"]}")
except yaml.YAMLError as e:
    print("YAML error:")
    print(e)

Example of compose.yaml

version: '3.8'
services:
  web:
    image: myapp:latest
    ports:
      - "8000:80"
  redis:
    image: redis:alpine

Serializing Python Objects with `yaml.dump`

Use yaml.dump(obj, indent=2, default_flow_style=False, sort_keys=False) for readable block-style output.
Set stream to an open file handle to write directly; leave it None to return a string.

import yaml
from pathlib import Path

python_cfg = {
    "service": {"name": "listener-service", "port": 6789, "workers": 4, "enabled": False},
    "queues": ["high", "default", "low"],
    "retry_policy": None,
}

output_path = Path("listener_config.yaml")

with output_path.open("w", encoding="utf-8") as file:
    yaml.dump(python_cfg, file, sort_keys=False, default_flow_style=False)

Example of listener_config.yaml

service:
  name: listener-service
  port: 6789
  workers: 4
  enabled: false
queues:
- high
- default
- low
retry_policy: null

Working with JSON files

JSON is the standard format for data exchange in web services and cloud APIs.
Python’s built-in json module provides functions to convert between JSON text and Python objects.
Key operations: parsing JSON from strings/files and serializing Python objects to JSON strings/files.

JSON Syntax and Python Mapping

JSON objects ({}) map to Python dict.
JSON arrays ([]) map to Python list.
JSON strings map to Python str, numbers to int or float.
true/false → True/False; null → None.
Keys in JSON objects must be double-quoted strings; no trailing commas.
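A round trip through json.dumps and json.loads makes the mapping visible, including two conversions that are easy to overlook: tuples come back as lists, and non-string keys come back as strings. The sample data is invented.

```python
import json

original = {
    "name": "web01",
    "ports": (80, 443),   # tuple: serialized as a JSON array
    "load": 0.75,
    "enabled": True,
    "owner": None,
    1: "integer key",     # non-string key: serialized as the string "1"
}

text = json.dumps(original)
restored = json.loads(text)

print(text)
print(restored)
```

Since the round trip is not always lossless, comparing restored with original directly would report a difference here.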

Deserializing JSON

Use json.loads() to parse JSON strings into Python objects.
Raises json.JSONDecodeError on invalid JSON.
Common in DevOps for handling API response bodies.

import json

api_response_str = '{"status": "active", "instance_id": "i-12345", "cores": 4, "tags": ["web", "prod"]}'

try:
    data = json.loads(api_response_str)
    print(f"Parsed data type: {type(data)}")
    print(f"Instance ID: {data.get("instance_id", None)}")
    print(f"Tags: {data.get("tags", None)}")
except json.JSONDecodeError as e:
    print(f"Failed to parse JSON: {e}")
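JSONDecodeError also records where parsing failed, which helps when reporting bad input. In this sketch the malformed string is made up, and the exact message wording and column can differ between Python versions.

```python
import json

# A trailing comma after the last array element makes this invalid JSON.
bad_json = '{"status": "active", "tags": ["web", "prod",]}'

try:
    json.loads(bad_json)
    error_info = None
except json.JSONDecodeError as exc:
    # msg, lineno and colno are attributes of the exception object.
    error_info = (exc.msg, exc.lineno, exc.colno)
    print(f"{exc.msg} at line {exc.lineno}, column {exc.colno}")
```

Logging the line and column alongside the message usually makes a malformed payload much quicker to locate.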

Parsing JSON Files

Use json.load() to read JSON from an open file object.
Always open files with encoding='utf-8' when dealing with JSON.
Wrap file operations in with to ensure proper closure.

import json
from pathlib import Path

config_path = Path("service_config.json")

with config_path.open("r", encoding="utf-8") as file:
    config_data = json.load(file)

for config in config_data:
    service_name = config.get("service", None)

    if service_name:
        print(f"Service: {service_name}")
        print(f"Enabled: {config.get("enabled", False)}")
        print('-' * 20)

Example of service_config.json

[
  {
    "service": "database",
    "port": 5432,
    "connection_pool": 10,
    "enabled": true
  },
  {
    "service": "cache",
    "port": 6379,
    "connection_pool": 5
  },
  {
    "service": "api",
    "port": 8080,
    "connection_pool": 3,
    "enabled": true
  },
  {
    "port": 5000,
    "connection_pool": 3,
    "enabled": true
  }
]

Serializing Python objects to JSON Strings

Use json.dumps() to convert Python objects to JSON strings.
indent makes output human-readable; sort_keys=True orders keys alphabetically.

import json

python_data = {
    "deployment": "frontend-v2",
    "replicas": 3,
    "ports": [80, 443],
    "health_check": True,
    "logs_enabled": None
}

print(f"Simple JSON:\n{json.dumps(python_data)}")
print("\n")
print(f"Pretty JSON:\n{json.dumps(python_data, indent=2, sort_keys=True)}")

Serializing Python objects to JSON Files

Use json.dump() to write Python objects directly to files.
Pass the file handle and optional indent for formatting.

import json
from pathlib import Path

output = {
    "status": "complete",
    "items_processed": 1492,
    "errors": []
}
output_path = Path("run_summary.json")

with output_path.open("w", encoding="utf-8") as file:
    json.dump(output, file, indent=2)
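Objects such as datetime or Path are not serializable out of the box; json.dump and json.dumps both accept a default callable for converting them. The converter to_jsonable and the sample values are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def to_jsonable(value):
    """Fallback converter for values the json module cannot encode (hypothetical helper)."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, Path):
        return value.as_posix()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


summary = {
    "finished_at": datetime(2024, 1, 15, 12, 30, tzinfo=timezone.utc),
    "log_file": Path("logs") / "run.log",
}

encoded = json.dumps(summary, default=to_jsonable, indent=2)
print(encoded)
```

Raising TypeError for unknown types keeps the default behaviour for anything the converter does not explicitly handle.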

Regex Essentials: Overview

Regular expressions (regex) are a language for defining text search patterns.
Python’s re module provides functions like search (find anywhere) and match (anchored at start).
Patterns include literals, metacharacters (. ^ $ * + ? [] \), character classes (\d, \w, \s), and quantifiers (*, +, ?, {n,m}).
Greedy quantifiers (*, +) match as much as possible; non-greedy (*?, +?) as little as possible.

Introduction to `re.search()` vs `re.match()`

re.search(pattern, text) scans the entire string for the first occurrence.
re.match(pattern, text) checks only at the beginning of the string.
re.findall() and re.finditer() let you retrieve every occurrence of a pattern.
Always use raw strings (r"...") to define regex patterns, avoiding Python string escapes interfering with regex.

import re

line = "WARN: Disk usage at 91%"
pattern = r"WARN"

print(f"search '{pattern}':", bool(re.search(pattern, line)))
print(f"match '{pattern}':", bool(re.match(pattern, line)))
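The example above covers search and match; the sketch below does the same for findall and finditer, using an invented log line.

```python
import re

log = "WARN: disk at 91% | INFO: ok | WARN: memory at 87%"
pattern = r"\d+%"

# findall returns the matched strings.
all_matches = re.findall(pattern, log)

# finditer yields match objects, which also carry positions.
positions = [(m.group(), m.start()) for m in re.finditer(pattern, log)]

print(all_matches)
print(positions)
```

Prefer finditer when you need offsets or groups for each hit; findall is enough when only the text matters.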

Common Metacharacters

. matches any character (except newline).
^ anchors at start of string.
$ anchors at end of string.
[] defines a set or range of characters, e.g. [A-Z].
\ escapes metacharacters or introduces special sequences.

import re

test = "Error code: E1234. cxge"

print(f"Dot matches any character: {re.findall(r"c..e", test)}")
print(f"Start anchor (finds): {re.findall(r"^Error", test)}")
print(f"Start anchor (does not find): {re.findall(r"^E1234", test)}")
print(f"End anchor: {re.findall(r"cxge$", test)}")
print(f"Character set: {re.findall(r"[E0-9]+", test)}")

Special Sequences

\d digit (0–9), \D non-digit.
\w word character (letters, digits, underscore), \W non-word.
\s whitespace, \S non-whitespace.
\b word boundary (zero-width match).

import re

text = "The cat scattered 1024 catalogues."

print(f"Digits: {re.findall(r"\d+", text)}")
print(f"Word characters: {re.findall(r"\w+", text)}")
print(f"Whitespace: {re.findall(r"\s+", text)}")
print(f"Word boundary: {re.findall(r"\bcat\b", text)}")

Quantifier Cheat-Sheet

Quantifier	Meaning	Greedy?	Non-greedy form	Non-greedy meaning
`?`	0 or 1 of the preceding token	Yes	`??`	as few as possible (0 or 1)
`*`	0 or more of the preceding token	Yes	`*?`	as few as possible (including zero)
`+`	1 or more of the preceding token	Yes	`+?`	as few as possible (at least one)
`{n}`	exactly n of the preceding token	-	-	-
`{n,}`	n or more of the preceding token	Yes	`{n,}?`	n or more, but as few as possible
`{n,m}`	between n and m of the preceding token	Yes	`{n,m}?`	between n and m, but as few as possible

import re

text = "aaaa"

print(re.findall(r"a?", text))
print(re.findall(r"a*", text))
print(re.findall(r"a+", text))
print(re.findall(r"a{2}", text))
print(re.findall(r"a{1,3}", text))

print(f"Non-greedy a*: {re.findall(r"a*?", text)}")
print(f"Non-greedy a+: {re.findall(r"a+?", text)}")
print(f"Non-greedy a{{1,3}}?: {re.findall(r"a{1,3}?", text)}")

Quantifiers & Greedy vs Non-Greedy

*, +, ?, and {n,m} are greedy by default: they match the longest possible string that still lets the overall pattern succeed.
Appending ? (*?, +?, ??, {n,m}?) makes them non-greedy (lazy): they match the shortest possible string instead.

import re

html = "<p>One</p><p>Two</p><></>"

print(f"Greedy: {re.findall(r"<.*>", html)}")
print(f"Non-greedy: {re.findall(r"<.*?>", html)}")

Capturing Groups and Back-References

Regex lets you check for patterns, but often you need to extract pieces of the match (e.g., IP vs port).
Capturing groups, defined with (), let you isolate and retrieve substrings from a match.
Named groups improve readability by giving meaningful labels instead of relying on group numbers.
Non-capturing groups (?:…) let you apply grouping logic without cluttering captures.
Back-references allow you to match the same text twice (or more) within one pattern.

Capturing Groups

Parentheses () both group and capture the matched text inside them.
Groups are numbered by their opening (, starting at 1; group 0 is the entire match.
Use match.group(n) for a single group or match.groups() to get all captures as a tuple.
Capturing is essential when you need to feed specific substrings into further processing.

import re

log_entry = "Ts=2023-10-27T12:00:00Z Level=ERROR User=admin Action=login_fail IP=10.0.0.5"

# Our goal:
# 1. Group 1: The log level
# 2. Group 2: The user name
# 3. Group 3: The IP address

pattern = r"Level=(\w+)\s+User=(\w+).*?\s+IP=([\d\.]+)"

match = re.search(pattern, log_entry)

if match:
    print(f"Full match: {match.group(0)}")
    print(f"Level: {match.group(1)}")
    print(f"User: {match.group(2)}")
    print(f"IP: {match.group(3)}")
    print(f"All groups: {match.groups()}")
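
The numbering rule above (groups are counted by their opening parenthesis) matters most when groups are nested. A minimal sketch, assuming a made-up date string and pattern that are not part of the original example:

```python
import re

# Groups are numbered by the position of their opening parenthesis,
# so the outer group gets its number before the groups nested inside it.
nested_pattern = r"((\d{4})-(\d{2}))-(\d{2})"
nested_match = re.search(nested_pattern, "Date: 2023-10-27")

if nested_match:
    print(f"Group 0 (entire match): {nested_match.group(0)}")
    print(f"Group 1 (outer, year-month): {nested_match.group(1)}")
    print(f"Group 2 (year): {nested_match.group(2)}")
    print(f"Group 3 (month): {nested_match.group(3)}")
    print(f"Group 4 (day): {nested_match.group(4)}")
```

Counting opening parentheses from left to right gives the same numbers that match.groups() uses for its tuple positions.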

Named Capturing Groups

Syntax: (?P<name>pattern) assigns a label to a capturing group.
Access by name: match.group('name') makes code self-documenting.
match.groupdict() returns a dict of all named captures.
You can still use numeric indices if needed, but names help avoid off-by-one errors.

import re

log_entry = "Ts=2023-10-27T12:00:00Z Level=ERROR User=admin Action=login_fail IP=10.0.0.5"

# Add labels to:
# 1. Group 1: The log level
# 2. Group 2: The user name
# 3. Group 3: The IP address

pattern = r"Level=(?P<level>\w+)\s+User=(?P<user>\w+).*?\s+IP=(?P<ip>[\d\.]+)"

match = re.search(pattern, log_entry)

if match:
    print(f"Full match: {match.group(0)}")
    print(f"Level: {match.group("level")}")
    print(f"User: {match.group("user")}")
    print(f"IP: {match.group("ip")}")
    print(f"All groups: {match.groups()}")
    print(f"Group dictionary: {match.groupdict()}")

Non-Capturing Groups

Use (?:pattern) when you need grouping for quantifiers or alternation without capturing.
Keeps your capture numbers focused on what you actually want.
Prevents unwanted None entries in match.groups() when using optional parts.

import re

log_line1 = "report.txt Status: OK"
log_line2 = "report Status: OK"

# Our goal:
# 1. Group 1: The stem of the filename, with .txt being an optional string
# 2. Group 2: The status code

pattern = r"^(.+?)(?:\.txt)?\s+Status:\s+(.+)$"

match_line1 = re.search(pattern, log_line1)
match_line2 = re.search(pattern, log_line2)

if match_line1: print(match_line1.groups())
if match_line2: print(match_line2.groups())
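
To see the None entry that a non-capturing group avoids, the same pattern can be compared with a variant where the optional .txt part is captured. This is only a sketch; the variable names are illustrative:

```python
import re

line = "report Status: OK"

# Same pattern twice: once with the optional extension captured,
# once with it wrapped in a non-capturing group.
capturing = r"^(.+?)(\.txt)?\s+Status:\s+(.+)$"
non_capturing = r"^(.+?)(?:\.txt)?\s+Status:\s+(.+)$"

capturing_groups = re.search(capturing, line).groups()
non_capturing_groups = re.search(non_capturing, line).groups()

# The optional capturing group did not take part in the match,
# so it shows up as None in the tuple.
print(f"Capturing:     {capturing_groups}")
print(f"Non-capturing: {non_capturing_groups}")
```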

Back-references

Refer back to a previous capture using \1, \2, … or (?P=name) for named groups.
Useful for matching repeated words or balanced constructs (e.g., open/close tags).
Can make patterns more complex but powerful for advanced text validation.

import re

text = "This this is a test test."
pattern_numbers = r"(?i)\b(\w+)\s+\1\b"
pattern_labels = r"(?i)\b(?P<word>\w+)\s+(?P=word)\b"

print(f"Doubled words: {re.findall(pattern_numbers, text)}")
print(f"Doubled words: {re.findall(pattern_labels, text)}")

html = "<p>Paragraph</p> <b>Bold</b>"
pattern_tags = r"<(\w+)>(.*?)</\1>"

print(f"Tags: {re.findall(pattern_tags, html)}")

Search, Split, and Substitute

re.findall() and re.finditer() let you retrieve every occurrence of a pattern.
re.split() handles complex delimiters beyond simple string splits.
re.sub() performs powerful search-and-replace operations, including reuse of captured groups.

Finding All Matches

re.findall(pattern, string) returns a list of all non-overlapping matches:
- No groups → list of matched substrings.
- With groups → list of the captured substrings: plain strings for a single group, tuples when the pattern has several groups.
re.finditer(pattern, string) returns an iterator of match objects, giving access to .group(), positions, named groups, etc., and is more memory-efficient for large inputs.

import re

text = "Errors found: 404, 500, 403, 500. User IDs: user123, admin99."
config = "timeout=60 retries=3 workers=5"

# Find all numbers (error codes and the digits inside user IDs):
print(f"Numbers found: {re.findall(r"\d+", text)}")

# findall with groups:
print(f"Key-value pairs: {re.findall(r"(\w+)=(\w+)", config)}")

# finditer
for match in re.finditer(r"(\w+)=(\w+)", config):
    print(f"Whole match: {match.group(0)}; key: {match.group(1)}; value: {match.group(2)} - at {match.start()}-{match.end()}")

Splitting Strings

Use re.split(pattern, string) to break a string on a regex pattern, not just a fixed substring.
Always use a raw string literal so backslashes reach the regex engine.
Simple single-character delimiters: use a character class (never captured), e.g. r"\s*[,;]\s*".
Complex delimiters (alternation or multi-character): group with non-capturing parentheses, e.g. r"\s*(?:foo|bar|baz)\s*", so they aren’t included in the result list.
Including delimiters: wrap your delimiter in a capturing group, e.g. r"\s*([,;])\s*", to have the separators appear in the split output.
Summary:
- No parentheses or a non-capturing group → delimiters are removed.
- Capturing group → delimiters appear in the split list.

import re

data = "item1 , item2; item3 ,item4 ;item5"

# 1. Split on comma and semi-colon
pattern1 = r"\s*[,;]\s*"
print(f"Character class split: {re.split(pattern1, data)}")

# 2. Capturing the delimiter
pattern2 = r"\s*([,;])\s*"
print(f"Capturing group split: {re.split(pattern2, data)}")

html = """
<p class='hello'>First paragraph.</p>
<b class='world'>Second paragraph.</b>
End.
"""

pattern3 = r"<.*?class='(?:hello|world)'.*?>|</[pb]>"
print(f"HTML non-capturing split: {re.split(pattern3, html)}")

Substituting Text

re.sub(pattern, replacement, string, count=0) replaces all (or a limited number) of matches.
count controls how many replacements to make (default 0 = all).
Back-references (\1, \g<name>) let you reorder or reuse captured text in the replacement.

import re

text = "User IDs: user123, user456, user123457689. Contact admin789 for help."

# Basic substitution
redacted = re.sub(r"user\d+", "[REDACTED_USER]", text)
print(f"Result of redacting: {redacted}")

# Back-reference for reusing information
redacted_partially = re.sub(r"(u)ser\d+(\d{2})", r"\1[REDACTED_USER]\2", text)
print(f"Result of redacting: {redacted_partially}")

# Limited count of substitutions
redacted_only_two = re.sub(r"(u)ser\d+(\d{2})", r"\1[REDACTED_USER]\2", text, count=2)
print(f"Result of redacting: {redacted_only_two}")

# Named groups for substitution
date_text = "Start: 2023-10-27, End: 2024-01-15"
# Current format YYYY-MM-DD
# Target format DD/MM/YYYY

date_pattern_named = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
replacement_format_named = r"\g<day>/\g<month>/\g<year>"
reformatted_date = re.sub(date_pattern_named, replacement_format_named, date_text)

print(f"Result of date transformation: {reformatted_date}")
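
One detail about back-references in the replacement string: when a numbered reference is followed directly by a digit, \1 becomes ambiguous. The re module also accepts \g<1> as a numbered form of \g<name>, which removes the ambiguity. A small sketch with made-up input:

```python
import re

text = "id7 id42"

# Goal: append a literal "0" right after the captured digits.
# r"id\10" is parsed as a reference to group 10, which this pattern
# does not define, so re.sub raises re.error.
try:
    re.sub(r"id(\d+)", r"id\10", text)
    ambiguous_failed = False
except re.error as error:
    ambiguous_failed = True
    print(f"Ambiguous reference rejected: {error}")

# \g<1> ends the group reference explicitly, so the 0 stays a literal.
padded = re.sub(r"id(\d+)", r"id\g<1>0", text)
print(f"Padded IDs: {padded}")
```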

Read/Write Text Files

Use open() to read/write text files with proper modes and encoding.
Specify encoding='utf-8' for portability.
Leverage with to ensure files close automatically.
Read via iteration, .read(), .readline(), .readlines().
Write via .write() or .writelines(), managing newlines manually.

try:
    with open("config.txt", "w", encoding="utf-8") as file:
        file.write("Setting=Value\n")
        file.write("Other=Another\n")

    with open("config.txt", "r", encoding="utf-8") as file:
        content = file.read()
        print("Contents of file:")
        print(content)
except OSError as e:
    print(f"File error: {e}")

File Modes

'r': read text (default), error if file missing.
'w': write text, create or truncate.
'a': append text, create if missing.
'x': exclusive create, error if exists (good to prevent overwrites).
'b': binary mode variant (e.g. 'rb', 'wb').
'+': update mode, allows read/write (e.g. 'r+', 'w+').

Understanding `+`

Mode	Reads?	Writes?	Creates if missing?	Truncates on open?
r	✅	❌	❌	❌
r+	✅	✅	❌	❌
w	❌	✅	✅	✅
w+	✅	✅	✅	✅
a	❌	✅	✅	❌
a+	✅	✅	✅	❌

from pathlib import Path

path = Path("mode_demo.txt")

with path.open(mode="w", encoding="utf-8") as file:
    file.write("Initial line\n")

with path.open(mode="a", encoding="utf-8") as file:
    file.write("Appended line\n")

try:
    with path.open(mode="x", encoding="utf-8") as file:
        file.write("This will fail if file exists\n")
except FileExistsError as e:
    print(e)
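
The update modes from the table can be tried out in the same way. The sketch below works in a temporary directory (an assumption made here so that no files are left behind); the file name is illustrative:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "plus_demo.txt"
    path.write_text("abc\n", encoding="utf-8")

    # r+ : read and write, no truncation; the position starts at 0,
    # so writing overwrites existing characters.
    with path.open(mode="r+", encoding="utf-8") as file:
        file.write("X")
        file.seek(0)
        after_r_plus = file.read()

    # a+ : writes always go to the end of the file;
    # seek(0) is needed before reading it back.
    with path.open(mode="a+", encoding="utf-8") as file:
        file.write("tail\n")
        file.seek(0)
        after_a_plus = file.read()

    # w+ : truncates on open, then allows reading what was just written.
    with path.open(mode="w+", encoding="utf-8") as file:
        file.write("fresh\n")
        file.seek(0)
        after_w_plus = file.read()

print(repr(after_r_plus))
print(repr(after_a_plus))
print(repr(after_w_plus))
```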

Reading Text Files

Iteration: for line in f:
- When to use: Ideal for processing large files line by line without loading the entire file into memory; lazy and very memory-efficient.
f.read(size = -1)
- size can be used to specify the maximum number of characters to read; if negative or omitted, reads the entire file.
- When to use: When you need to grab a chunk of text (e.g. next 1024 chars). Good for bulk reads; but beware of high memory usage if you read the whole file at once.
f.readline(size = -1)
- size can be used to specify the maximum number of characters to read from the line; if negative or omitted, reads the full line up to and including the newline.
- When to use: When you want one line at a time but need to guard against overly long lines. Returns an empty string when you reach EOF.
f.readlines(hint = -1)
- hint can be used to define the approximate total number of bytes to read; if negative or omitted, reads all lines.
- When to use: When the file is small or moderate in size and you want a list of all lines for easy indexing or list comprehensions. Not recommended for very large files (may exhaust memory).

from pathlib import Path

sample = Path("read_demo.txt")
sample.write_text("First\nSecond\nThird\n", encoding="utf-8")

print("Iteration for reading:")
with sample.open(mode="r", encoding="utf-8") as file:
    for line in file:
        print(f" -> {line.strip()}")

print("read() for reading:")
with sample.open(mode="r", encoding="utf-8") as file:
    print(file.read())

print("readline() for reading:")
with sample.open(mode="r", encoding="utf-8") as file:
    print(file.readline())

print("readlines() for reading:")
with sample.open(mode="r", encoding="utf-8") as file:
    print(file.readlines())
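
The size argument of read() and the empty-string EOF signal of readline() are easiest to see in a loop. A minimal sketch, assuming a temporary file and a deliberately tiny chunk size of 8 characters:

```python
import tempfile
from pathlib import Path

text = "First\nSecond\nThird\n"

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "chunk_demo.txt"
    sample.write_text(text, encoding="utf-8")

    # read(size): fetch at most 8 characters per call until EOF,
    # which is signalled by an empty string.
    chunks = []
    with sample.open(mode="r", encoding="utf-8") as file:
        while chunk := file.read(8):
            chunks.append(chunk)

    # readline(): one line per call; an empty string means EOF.
    # A blank line in the file would come back as "\n", not "".
    lines = []
    with sample.open(mode="r", encoding="utf-8") as file:
        while True:
            line = file.readline()
            if line == "":
                break
            lines.append(line.rstrip("\n"))

print(f"Chunks: {chunks}")
print(f"Lines: {lines}")
```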

Writing Text Files

f.write(s)
- s is the string to write; does not add a newline automatically.
- When to use: When writing single strings or building content piece by piece. Returns the number of characters written, so you can verify success.
f.writelines(lines: Iterable[str]) -> None
- lines can be any iterable of strings; does not add newlines for you.
- When to use: When you need to write a batch of strings at once (for example, a list of CSV rows). It is tidier than calling .write() in a loop, but you must include \n at the end of each string if you want line breaks.

from pathlib import Path

write_demo = Path("write_demo.txt")

with write_demo.open(mode="w", encoding="utf-8") as file:
    file.write("Line A\n")
    file.write("Line B\n")

lines_to_write = [
    "user,ip,role",
    "alice,10.0.0.0,admin",
    "bob,10.0.0.1,dev",
    "charlie,10.0.0.2,audit"
]
# Reopening in "w" mode truncates the file, so the CSV rows replace Line A / Line B
with write_demo.open(mode="w", encoding="utf-8") as file:
    file.writelines(f"{line}\n" for line in lines_to_write)

Working with Filesystem Paths in Python

Manipulating paths as plain strings is error-prone and OS-specific.
pathlib provides an object-oriented, cross-platform way to handle paths.
Path objects offer intuitive operators and methods for most filesystem tasks.

Limitations of String Paths and `os.path`

Using os.path.join, os.path.exists, etc., requires multiple function calls.
Code readability suffers when paths are manipulated as plain strings.
OS differences ("/" vs "\" separators) must be handled explicitly.
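
For comparison, here is roughly what the same path handling looks like with os.path and with pathlib. This is only a sketch; the relative path var/log/app/service.log is an illustrative value and nothing is read from disk:

```python
import os.path
from pathlib import Path

# String-based approach: one function call per operation.
log_str = os.path.join("var", "log", "app", "service.log")
str_name = os.path.basename(log_str)
str_stem, str_suffix = os.path.splitext(str_name)
str_parent = os.path.dirname(log_str)

# pathlib: one object, components exposed as attributes.
log_path = Path("var") / "log" / "app" / "service.log"

print(f"os.path: {str_parent} | {str_name} | {str_stem} | {str_suffix}")
print(f"pathlib: {log_path.parent} | {log_path.name} | {log_path.stem} | {log_path.suffix}")
```

Both variants pick the separator for the current OS; the difference is mostly in readability.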

Creating and Combining `Path` Objects

Import Path from pathlib.
Create Path objects for directories and files.
Use the / operator to join path components cleanly.

from pathlib import Path

config_dir = Path(".")
filename = "settings.yaml"

print(config_dir, type(config_dir))

config_path = config_dir / filename
print(config_path.resolve())

Inspecting Path Properties

.exists(), .is_file(), .is_dir() check path state.
.parent, .name, .stem, .suffix expose components.
.resolve() returns the absolute, canonical path.

service_log = Path("/var/log/app/service.log")

print(f"Exists: {service_log.exists()}")
print(f"Is file? {service_log.is_file()}")
print(f"Is directory? {service_log.is_dir()}")
print(f"Parent: {service_log.parent}")
print(f"Name: {service_log.name}")
print(f"Stem: {service_log.stem}")
print(f"Suffix: {service_log.suffix}")
print(f"Resolved absolute path: {service_log.resolve()}")

Listing Directory Contents

.iterdir() yields immediate children of a directory.
.glob(pattern) finds entries matching a shell-style pattern.
Use "**/*.ext" in glob for recursive searches.

course_parent = Path("..")

print("Immediate children:")

for i, child in enumerate(course_parent.iterdir()):
    print(f"  {child.name} - {child.is_dir()}")
    if i >= 4: break

print("Notebook files recursively:")

for i, child in enumerate(course_parent.glob("**/*.ipynb")):
    print(f"  {child}")
    if i >= 10: break

Reading and Writing Files with `Path`

.write_text() and .read_text() handle simple text I/O.
Use p.open(mode="a") for more control (e.g., appending, binary mode).
.read_text() and .write_text() open and close the file for you; .open() still needs a with block.

test_file = Path("demo.txt")

test_file.write_text("Hello, from pathlib!", encoding="utf-8")
print(f"Read back: {test_file.read_text(encoding="utf-8")}")

with test_file.open(mode="a", encoding="utf-8") as file:
    file.write("\nAppended line!")

print(f"Read back: {test_file.read_text(encoding="utf-8")}")

test_file.unlink()

Declarative Logging Configuration

Declarative configuration separates setup from code, making it easier to maintain and adjust.
Python’s logging.config module supports both INI-style (fileConfig) and dictionary-based (dictConfig) configurations.
Configuration objects can be loaded from files (INI, JSON, YAML) or defined in code.
Benefits include environment-specific overrides, less boilerplate, and clearer visibility of logger/handler relationships.

INI-Style Configuration with `fileConfig`

Uses an INI-format file to define loggers, handlers, and formatters.
Sections: [loggers], [handlers], [formatters], plus one section per named logger/handler/formatter.
Good for simple setups and backwards compatibility, but less flexible for dynamic structures.

Dictionary-Based Configuration with `dictConfig`

Configuration defined as a Python dict, offering full programmatic control.
Keys: version, disable_existing_loggers, and mappings for formatters, handlers, loggers, and optionally root.
Easy to build or modify at runtime, and to serialize/deserialize via JSON/YAML.

Loading Configuration from JSON or YAML

Store the same dict-based schema in a JSON/YAML file for external editing.
Read and parse the file, then pass the resulting dict to dictConfig.
Enables separation of concerns: ops teams can tweak logging without touching code.

Dynamic and Programmatic Adjustments

You can modify the config dict at runtime before calling dictConfig.
Handlers, formatters, and levels can be added, removed, or tweaked based on environment variables or feature flags.
Example: switch file logging on/off depending on a DEBUG flag.

import logging
import logging.config
import json
from typing import Any, Dict

"""
# Uncomment to test INI configuration

# Declarative logging configuration - INI-file
print("Declarative configuration using INI files")
print("---------\n")

config_path = "declarative-config.ini"

logging.config.fileConfig(
    fname=config_path,
)

app_logger = logging.getLogger("app")
app_logger.debug("INI-style fileConfig is working!")
"""

"""
# Uncomment to test Dictionary configuration

# Declarative logging configuration - Dictionary config
print("Declarative configuration using dictionary config")
print("---------\n")

dict_config: Dict[str, Any] = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(levelname)-8s - %(message)s"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        }
    },
    "loggers": {
        "config.dict": {
            "level": "DEBUG",
            "handlers": ["console"],
        }
    },
}

logging.config.dictConfig(dict_config)
config_logger = logging.getLogger("config.dict")
config_logger.debug("dictConfig setup successfully")
config_logger.info("Info goes to console")
"""

"""
# Uncomment to test JSON configuration

# Declarative logging configuration - JSON config
print("Declarative configuration using JSON config")
print("---------\n")

config_path = "declarative-config.json"

with open(config_path, "r") as config_file:
    json_config = json.load(config_file)

logging.config.dictConfig(json_config)
config_logger = logging.getLogger("config.json")
config_logger.debug("JSON config setup successfully")
config_logger.info("Info goes to console")
"""

# Dynamically building config
print("Dynamically building config")
print("---------\n")

base_config: Dict[str, Any] = {
    "version": 1,
    "disable_existing_loggers": True,
    "handlers": {},
    "formatters": {},
    "loggers": {},
}

base_config["formatters"]["simple"] = {
    "format": "%(levelname)-8s - %(message)s"
}

base_config["handlers"]["console"] = {
    "class": "logging.StreamHandler",
    "level": "DEBUG",
    "formatter": "simple",
    "stream": "ext://sys.stdout",
}

base_config["loggers"]["config.dynamic"] = {
    "level": "WARNING",
    "handlers": ["console"],
}

def is_debug():
    return True

if is_debug():
    for logger, _config in base_config["loggers"].items():
        base_config["loggers"][logger]["level"] = "DEBUG"

logging.config.dictConfig(base_config)
config_logger = logging.getLogger("config.dynamic")
config_logger.debug("Dynamic config setup successfully")
config_logger.info("Info goes to console")
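
The dynamic example adjusts levels only. Switching file logging on or off with a flag could look roughly like the sketch below. The build_config helper, the APP_DEBUG environment variable, the config.toggle logger name, and the log file location are all assumptions made for illustration:

```python
import logging
import logging.config
import os
import tempfile
from typing import Any, Dict


def build_config(debug: bool, log_file: str) -> Dict[str, Any]:
    """Hypothetical helper: console logging always, file logging only in debug mode."""
    config: Dict[str, Any] = {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {"format": "%(levelname)-8s - %(message)s"}
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "level": "INFO",
                "formatter": "simple",
                "stream": "ext://sys.stdout",
            }
        },
        "loggers": {
            "config.toggle": {"level": "INFO", "handlers": ["console"]}
        },
    }

    if debug:
        # Add the file handler and lower the logger level only when debugging.
        config["handlers"]["file"] = {
            "class": "logging.FileHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "filename": log_file,
            "encoding": "utf-8",
            "delay": True,  # open the file only on first write
        }
        config["loggers"]["config.toggle"]["handlers"].append("file")
        config["loggers"]["config.toggle"]["level"] = "DEBUG"

    return config


# APP_DEBUG is an assumed flag name; unset means file logging stays off.
debug_enabled = os.environ.get("APP_DEBUG", "0") == "1"
debug_log_file = os.path.join(tempfile.gettempdir(), "config_toggle_debug.log")

logging.config.dictConfig(build_config(debug_enabled, debug_log_file))
toggle_logger = logging.getLogger("config.toggle")
toggle_logger.debug("Written to the file only when debug mode is on")
toggle_logger.info("Info goes to console")
```

Because the dict is built before dictConfig is called, the decision stays in one place and the rest of the code only ever asks for the logger by name.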

Example of declarative-config.ini

[loggers]
keys=root,app

[handlers]
keys=consoleHandler,nullHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=INFO
handlers=consoleHandler

[logger_app]
level=DEBUG
handlers=nullHandler
qualname=app

[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=simpleFormatter
args=(sys.stdout,)

[handler_nullHandler]
class=NullHandler
level=NOTSET
formatter=
args=()

[formatter_simpleFormatter]
format=%(asctime)s - %(name)s - %(levelname)-8s - %(message)s

Example of declarative-config.json

{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "simple": { "format": "%(levelname)-8s - %(message)s" },
    "detailed": {
      "format": "%(asctime)s %(name)s [%(levelname)s]: %(message)s",
      "datefmt": "%Y-%m-%d %H:%M:%S"
    }
  },
  "handlers": {
    "console": {
      "class": "logging.StreamHandler",
      "level": "INFO",
      "formatter": "detailed",
      "stream": "ext://sys.stdout"
    }
  },
  "loggers": {
    "config.json": {
      "level": "DEBUG"
    }
  },
  "root": {
    "level": "DEBUG",
    "handlers": ["console"]
  }
}

Introduction to Structured Logging

Plain-text logs are hard to parse and brittle to format changes.
Structured logging records events as key-value data, making machine parsing trivial.
JSON is a de-facto standard: human-readable yet easily ingested by ELK, Splunk, DataDog, etc.
The python-json-logger package integrates JSON output into the standard logging workflow.

Configuring `python-json-logger`

Install via pip install python-json-logger==3.3.0 (for consistency, I'm pinning the version; removing it will install the latest version available).
Replace logging.Formatter with pythonjsonlogger.json.JsonFormatter.
Specify a format string listing the LogRecord attributes you want as JSON keys.
Attach to any Handler just like a normal Formatter.

Logging with Extra Context

Pass a dict to the extra parameter of logger.<level>().
Keys in extra become top-level JSON fields.
Use for request IDs, user IDs, session tokens, or any domain data.

Logging Exceptions as JSON

Use logger.exception(...) inside an except block.
The JsonFormatter automatically adds an exc_info key with the traceback.
This preserves full error context for downstream analysis.

# Configuring python-json-logger
print("Configuring python-json-logger")
print("---------\n")

import logging
import sys
from pythonjsonlogger.json import JsonFormatter

json_logger = logging.getLogger("demo.json")
json_logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)
json_formatter = JsonFormatter(
    "{asctime}{levelname}{message}",
    style="{",
    json_indent=4,
    rename_fields={"asctime": "timestamp", "levelname": "level"},
)
handler.setFormatter(json_formatter)

json_logger.addHandler(handler)

json_logger.info("Structured logging initialized")

# Logging with extra context
print("Logging with extra context")
print("---------\n")

extra_context = {
    "user_id": "devops1",
    "request_id": "request-12345abc",
    "source_ip": "10.0.0.5",
}

json_logger.warning(
    "Request took longer than 5s to complete",
    extra=extra_context,
)

# Logging exceptions as JSON
print("Logging exceptions as JSON")
print("---------\n")

try:
    result = 1 / 0
except ZeroDivisionError:
    json_logger.exception(
        "Unexpected calculation error",
        extra={"operation": "division"},
    )

Logging to Files

Basic File Logging with `FileHandler`

Use logging.FileHandler to write log records to a file.
mode='a' (append) preserves existing logs; mode='w' (write) overwrites on each run.
You can specify encoding (e.g., 'utf-8') and delay=True to open the file only on first write.

Size-Based Rotation with `RotatingFileHandler`

RotatingFileHandler rotates when the file reaches maxBytes.
backupCount determines how many old files to keep (.1, .2, …).
New rotations rename existing backups, deleting the oldest beyond backupCount.

Time-Based Rotation with `TimedRotatingFileHandler`

TimedRotatingFileHandler rotates based on elapsed time (when, interval).
Common when values (case insensitive): 'S', 'M', 'H', 'D', 'midnight', 'W0'-'W6'
- 'S' – Rotate every N seconds (as given by interval), useful for very short-lived scripts or testing.
- 'M' – Rotate every N minutes, good for high-volume services where hourly isn’t fine-grained enough.
- 'H' – Rotate every N hours, often used for long-running daemons that batch logs hourly.
- 'D' – Rotate every N days, for simple daily log files without tying to midnight.
- 'midnight' – Rotate once per day exactly at midnight (local time), regardless of interval, ideal for calendar-aligned logs.
- 'W0'–'W6' – Rotate weekly on a specific weekday, where W0 = Monday through W6 = Sunday. The interval value is not used for weekday-based rotation.
backupCount limits number of rotated files; use .suffix to customize timestamp format.

import logging
import logging.handlers
import os
import time

def cleanup_log_files(base_name: str):
    for file_name in os.listdir("."):
        if file_name.startswith(base_name):
            os.remove(file_name)

# Basic logging with FileHandler
print("Basic logging with FileHandler")
print("-------\n")

basic_logger = logging.getLogger("file.basic")
basic_logger.setLevel(logging.DEBUG)

basic_fh = logging.FileHandler(
    "basicfile.log", delay=True, encoding="utf-8"
)
basic_fh.setLevel(logging.INFO)

basic_logger.addHandler(basic_fh)

basic_logger.info("INFO: will be written to file")

# Size-based log rotation with RotatingFileHandler
print("Size-based log rotation with RotatingFileHandler")
print("-------\n")

rotating_logs_filename = "rotatingfile.log"

cleanup_log_files(rotating_logs_filename)

rotating_logger = logging.getLogger("file.rotating")
rotating_logger.setLevel(logging.DEBUG)

rotating_fh = logging.handlers.RotatingFileHandler(
    rotating_logs_filename,
    maxBytes=500,
    backupCount=2,
    encoding="utf-8",
)
rotating_fh.setFormatter(
    logging.Formatter("%(levelname)-8s %(message)s")
)

rotating_logger.addHandler(rotating_fh)

for i in range(30):
    rotating_logger.info(f"Entry {i}: {'Z' * 50}")
    time.sleep(0.05)

# Time-based log rotation with TimedRotatingFileHandler
print("Time-based log rotation with TimedRotatingFileHandler")
print("-------\n")

timed_rotating_logs_filename = "timedrotatingfile.log"

cleanup_log_files(timed_rotating_logs_filename)

timed_rotating_logger = logging.getLogger("file.timed")
timed_rotating_logger.setLevel(logging.DEBUG)

timed_rotating_fh = logging.handlers.TimedRotatingFileHandler(
    timed_rotating_logs_filename,
    when="s",
    interval=3,
    backupCount=2,
    encoding="utf-8",
)
timed_rotating_fh.setFormatter(
    logging.Formatter("%(levelname)-8s %(message)s")
)

timed_rotating_logger.addHandler(timed_rotating_fh)

for i in range(30):
    timed_rotating_logger.info(f"Entry {i}: {'Z' * 50}")
    time.sleep(0.5)

Log Levels in Practice

Python defines five standard levels with increasing severity:
- DEBUG (10): Detailed diagnostic information.
- INFO (20): Confirmation that things are working normally.
- WARNING (30): An indication of potential problems or deprecation.
- ERROR (40): A failure in a specific operation.
- CRITICAL (50): A serious error after which the program may be unable to continue running.
NOTSET (0) causes a logger to inherit its parent’s effective level.
Appropriate use of these levels lets you adjust verbosity without changing code.

Two-Stage Filtering: Logger vs Handler

Logger Level: First gate: records below logger.level are discarded immediately.
Handler Level: Second gate: each handler only emits records at or above its handler.level.
This allows, for example, DEBUG messages to be logged to a file but only WARNING and above to the console.

Configuring Logger & Handlers

Use logger.setLevel(...) to control which messages the logger accepts.
Use handler.setLevel(...) to control which accepted messages each handler emits.
Attach multiple handlers for different outputs (e.g., console vs file) with independent levels.


# Log levels in practice

import logging
import sys

print("Log levels in practice")
print("------\n")

for lvl in (
    logging.DEBUG,
    logging.INFO,
    logging.WARNING,
    logging.ERROR,
    logging.CRITICAL,
):
    print(
        f"{logging.getLevelName(lvl):8} = {lvl}"
    )

# Two-stage filtering

print("\n")
print("Two stage filtering")
print("------\n")

filter_logger = logging.getLogger("demo.filter")
filter_logger.setLevel(logging.INFO)

stream_handler = logging.StreamHandler(sys.stdout)
stream_handler.setLevel(logging.ERROR)

filter_logger.addHandler(stream_handler)

filter_logger.info("INFO: will not be shown")
filter_logger.error("ERROR: will be shown")

# Configuring logs and handlers

print("\n")
print("Configuring logs and handlers")
print("------\n")

data_logger = logging.getLogger("demo.data")
data_logger.setLevel(logging.DEBUG)

data_sh = logging.StreamHandler(sys.stdout)
data_sh.setLevel(logging.ERROR)

data_fh = logging.FileHandler("process.log", "w")
data_fh.setLevel(logging.INFO)

data_logger.addHandler(data_sh)
data_logger.addHandler(data_fh)

data_logger.debug("DEBUG: will be dropped")
data_logger.info("INFO: file only")
data_logger.warning("WARNING: file only")
data_logger.error("ERROR: file and console")
data_logger.critical("CRITICAL: file and console")
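
The NOTSET behaviour mentioned at the top of this section can be checked with getEffectiveLevel(). A minimal sketch; the logger names demo.inherit and demo.inherit.child are illustrative:

```python
import logging

parent_logger = logging.getLogger("demo.inherit")
child_logger = logging.getLogger("demo.inherit.child")

# Only the parent gets an explicit level; the child stays at NOTSET.
parent_logger.setLevel(logging.WARNING)

own_level = logging.getLevelName(child_logger.level)
effective_level = logging.getLevelName(child_logger.getEffectiveLevel())

print(f"Child own level: {own_level}")
print(f"Child effective level: {effective_level}")

# The first gate uses the effective level, so INFO is rejected here.
print(f"INFO enabled on child? {child_logger.isEnabledFor(logging.INFO)}")
print(f"ERROR enabled on child? {child_logger.isEnabledFor(logging.ERROR)}")
```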

Python Logging Anatomy

Python’s logging module has five core components: Loggers, Log Records, Handlers, Formatters and Filters.
Loggers are hierarchical objects your code calls to emit messages at various severity levels.
Each call to a logger creates a LogRecord capturing metadata: level, message, timestamp, source, thread/process IDs, exception info, etc.
Handlers attached to loggers dispatch records to destinations (console, files, network).
Formatters define how a LogRecord is rendered into the final string emitted by a handler.

import logging

root_logger = logging.getLogger()
print(f"Root logger: name={root_logger.name}, level={logging.getLevelName(root_logger.level)}")

app_logger = logging.getLogger("app")
print(f"App logger: name={app_logger.name}, level={logging.getLevelName(app_logger.level)}, parent={app_logger.parent.name}")

network_logger = logging.getLogger("app.network")
print(f"Network logger: name={network_logger.name}, level={logging.getLevelName(network_logger.level)}, parent={network_logger.parent.name}")

Log Records

Each logging call (logger.info(), logger.error(), etc.) creates a LogRecord object behind the scenes.
A LogRecord includes attributes such as name, levelno, levelname, pathname, lineno, funcName, plus any user-supplied extra data; asctime and message are added when a formatter processes the record.
Handlers and formatters use these attributes to filter and render the log entry.

from logging import LogRecord

record = LogRecord(
    name="app.network",
    level=logging.ERROR,
    pathname="/path/to/file.py",
    lineno=43,
    msg="My log message",
    args=(),
    exc_info=None
)

print("LogRecord contents:")

for attr in ("name", "levelname", "pathname", "msg"):
    print(f"    {attr} => {getattr(record, attr)}")

Handlers

Handlers determine where log records are sent (console, file, network, etc.).
Each handler has its own level: it filters out any record whose level is below its threshold.
Common handlers include:
- StreamHandler (console),
- FileHandler (single file),
- RotatingFileHandler,
- TimedRotatingFileHandler,
- SysLogHandler,
- HTTPHandler,
- NullHandler.

import sys

demo_logger = logging.getLogger("handler_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.handlers.clear()

stream_handler = logging.StreamHandler(sys.stdout)
stream_handler.setLevel(logging.DEBUG)
demo_logger.addHandler(stream_handler)

demo_logger.debug("Debug message: will not show")
demo_logger.info("Info message: will show")
demo_logger.warning("Warning message: will show")
demo_logger.error("Error message: will show")

Formatters

Formatters specify the layout of the final log message string.
You define a format string using %(attribute)s or %(attribute)d placeholders.
Common attributes: asctime, levelname, name, message, filename, lineno, funcName, process, thread.

formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
)

stream_handler.setFormatter(formatter)

demo_logger.warning("Formatted warning")
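
Filters are the one core component from the overview without an example so far. The sketch below attaches a custom filter to a handler; the DropHeartbeats class, the filter_demo logger name, and the messages are made up for illustration, and output goes to an in-memory buffer so it is easy to inspect:

```python
import io
import logging


class DropHeartbeats(logging.Filter):
    """Hypothetical filter: reject any record whose message mentions 'heartbeat'."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False drops the record; True lets it through.
        return "heartbeat" not in record.getMessage()


buffer = io.StringIO()
filter_handler = logging.StreamHandler(buffer)
filter_handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
filter_handler.addFilter(DropHeartbeats())

filter_demo_logger = logging.getLogger("filter_demo")
filter_demo_logger.setLevel(logging.INFO)
filter_demo_logger.propagate = False  # keep the demo output in the buffer only
filter_demo_logger.addHandler(filter_handler)

filter_demo_logger.info("heartbeat ok")
filter_demo_logger.info("Connection established")

print(buffer.getvalue())
```

A filter can be attached to a logger or to a handler; attaching it to the handler, as here, affects only that destination.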

Context Managers

When opening files or acquiring locks, resources must be released even if errors occur.
Manual try...finally ensures cleanup but adds boilerplate and potential for mistakes.
Forgetting to initialize the resource variable or to call cleanup in every exit path leads to leaks, deadlocks, or corrupted data.
Cleaner patterns reduce noise and risk in automation scripts.

f = None

try:
    f = open("my_log.txt", "w")
    f.write("First line\n")
    # Simulate an error
    result = 1 / 0
    f.write("Second line\n")
except ZeroDivisionError:
    print("Error has occurred.")
finally:
    if f:
        print("Closing file.")
        f.close()

print(f"File closed: {f.closed}")

The `with` Statement Simplifies Cleanup

The with statement handles setup and teardown automatically for context managers.
For file I/O, with open(...) as f: guarantees f.close() on block exit, even if an exception is raised.
Syntax is concise and idiomatic, reducing boilerplate and improving readability.

Common Context Manager Examples

Files: with open(...) as f: for automatic file closing.
Locks: with lock: (where lock = threading.Lock() is shared between threads) acquires and releases the lock safely.
Tempfiles/Dirs: with tempfile.TemporaryDirectory() as d: creates and cleans up temporary directories.
Context managers from the standard library cover most resource-management needs.

f = None

try:
    with open("my_log.txt", "w") as f:
        f.write("First line\n")
        # Simulate an error
        result = 1 / 0
        f.write("Second line\n")
except ZeroDivisionError:
    print("Error has occurred.")

print(f"File closed: {f.closed}")

import tempfile, os

dir_name = None

with tempfile.TemporaryDirectory() as tempdir:
    print(f"Created temp dir: {tempdir}")

    dir_name = tempdir
    test_file = os.path.join(tempdir, "test.txt")

    with open(test_file, "w") as file:
        file.write("Hello from temp directory.")

    print(f"Files inside temp dir: {os.listdir(tempdir)}")

try:
    contents = os.listdir(dir_name)
    print(f"Contents of {dir_name}: {contents}")
except FileNotFoundError as e:
    print(f"Expected error accessing removed directory: {e}")
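
The lock case from the list above can be sketched in the same spirit. The counter, the thread count, and the number of iterations are arbitrary illustrative values:

```python
import threading

state = {"count": 0}
counter_lock = threading.Lock()  # one lock object shared by all threads


def increment(times: int) -> None:
    for _ in range(times):
        # The lock is acquired on entry and released on exit,
        # even if the body were to raise an exception.
        with counter_lock:
            state["count"] += 1


threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]

for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Final count: {state['count']}")
print(f"Lock still held? {counter_lock.locked()}")
```

Note that the lock is created once and reused; creating a new Lock() inside each with statement would not protect anything, because every thread would get its own lock.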

Custom Resource Management: Writing Context Managers

Whenever you need custom setup/teardown logic, you can write your own Context Manager.
A context manager ensures that teardown always runs, even if errors occur in the block.
Two approaches: implement __enter__/__exit__ in a class or use the simpler generator-based decorator.

class MyContextManager:
    def __init__(self, timeout):
        self.timeout = timeout

    def __enter__(self):
        print("Setup complete")
        return "a simple value"

    def __exit__(self, exception_type, exception_value, traceback):
        print("Teardown")

        # Returning False lets any exception raised in the block propagate.
        return False

with MyContextManager(timeout=30) as cm:
    print(cm)
    print("Inside the block")
    raise ValueError("Simulated problem")
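
The return value of __exit__ decides whether an exception leaves the with block. The sketch below illustrates this with a hypothetical class, SuppressValueError, which is not part of the original notes. Returning True tells Python the exception was handled; returning False lets it propagate.

```python
class SuppressValueError:
    """Hypothetical context manager that swallows ValueError and nothing else."""

    def __enter__(self):
        return self

    def __exit__(self, exception_type, exception_value, traceback):
        # exception_type is None when the block finished without an error.
        if exception_type is None:
            print("Block finished cleanly")
            return False
        print(f"Block raised {exception_type.__name__}: {exception_value}")
        # True suppresses the exception; False lets it propagate to the caller.
        return issubclass(exception_type, ValueError)

with SuppressValueError():
    raise ValueError("Simulated problem")

print("Execution continues after the with block")

propagated = False
try:
    with SuppressValueError():
        raise KeyError("not suppressed")
except KeyError:
    propagated = True

print(f"KeyError propagated: {propagated}")
```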

The `@contextlib.contextmanager` Decorator

Provided by the contextlib module to turn a generator into a context manager.
Decorated function needs exactly one yield.
Code before yield runs as __enter__; code after (or in finally) runs as __exit__.
Simplifies many common patterns without writing a full class.

Generator Structure for `@contextmanager`

Wrap the yield in try...finally to ensure teardown even on errors.
The value yielded is bound to as var in the with statement (if used).
You can catch exceptions inside the generator if you want to suppress them.

import os
from contextlib import contextmanager

@contextmanager
def change_directory(destination):
    """
    Temporarily switch into destination. If the directory does not exist,
    it is created just before the switch.

    Args:
        destination (str): Path to the directory that should become the working directory
    """

    origin_dir = os.getcwd()

    try:
        print(f"Changing into {destination}")
        os.makedirs(destination, exist_ok=True)
        os.chdir(destination)
        yield os.getcwd()
    finally:
        print(f"Reverting to original dir: {origin_dir}")
        os.chdir(origin_dir)

print(f"Start: {os.getcwd()}")

with change_directory("temp_dir") as new_dir:
    print(f"Inside: {new_dir}")

print(f"End: {os.getcwd()}")
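
Catching an exception inside the generator suppresses it, which might look like the sketch below. The helper ignore_missing_file and the file name are hypothetical. For this specific case the standard library also offers contextlib.suppress.

```python
from contextlib import contextmanager

@contextmanager
def ignore_missing_file():
    """Hypothetical helper: suppress FileNotFoundError raised inside the with block."""
    try:
        yield
    except FileNotFoundError as err:
        # Catching here without re-raising suppresses the exception.
        print(f"Ignored: {err}")
    finally:
        print("Teardown runs either way")

with ignore_missing_file():
    open("definitely_missing_file_12345.txt")

print("Still running")
```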

Custom Exceptions: Tailoring Error Signals

Built-in exceptions are great, but often too generic for application-specific failures.
A custom exception like ServiceConnectionError immediately conveys context compared to a plain Exception.
Defining a base exception class groups related errors; subclasses add specificity for targeted handling.
Catching except BaseError: handles all related issues, while except SpecificError: addresses one case precisely.

Simple Custom Exceptions (Inheritance)

Create a new exception by subclassing Exception or another exception class.
Using pass is enough when no extra logic or attributes are needed.
Catch the base class (AutomationError) to handle any related subclass errors in one block.
Use subclasses (FileProcessingError, APICallError) when context-specific handling is required.

class AutomationError(Exception):
    """Base for all automation script errors."""
    pass

class FileProcessingError(AutomationError):
    """Error during file processing stage."""
    pass

class APICallError(AutomationError):
    """Error during an external API call."""
    pass

def process_file(filepath):
    raise FileProcessingError(f"Failed to process file at path: {filepath}")

try:
    process_file("nonexistent.csv")
except FileProcessingError as e:
    print(f"File error: {e}")
except AutomationError:
    print("Other automation error occurred.")

Adding Context with `__init__`

Override __init__ in your exception class to capture context (e.g., filename, invalid value).
Store custom attributes on self and build a clear message passed to super().__init__().
Inherit from a built-in exception (ValueError) when semantics align, allowing broad catches.
Attribute access (e.key_name) provides extra debugging info in handlers.

class ConfigValueError(ValueError):
    """Raised when a config value is invalid."""
    def __init__(self, key_name, invalid_value, message="Invalid configuration value."):
        self.key_name = key_name
        self.invalid_value = invalid_value
        full_message = f"{message} for key '{key_name}': received '{invalid_value}'"
        super().__init__(full_message)

try:
    raise ConfigValueError("timeout", -5, message="Timeout cannot be negative")
except ConfigValueError as e:
    print(f"{e}")
    print(f"   -> key: {e.key_name}")
    print(f"   -> value: {e.invalid_value}")

Raising and Catching Enhanced Custom Exceptions

Raise custom exceptions by instantiating them with relevant arguments: raise MyError(arg1, arg2).
In except blocks, catch specific exceptions and access their attributes for tailored recovery or logging.
Fallback except BaseError: catches any related subclass if no more specific handler exists.

class DeploymentError(Exception):
    """Base class for deployment-related errors."""
    pass

class InvalidEnvironmentError(DeploymentError):
    """Raised when environment is invalid."""
    def __init__(self, env_name, allowed_envs):
        self.env_name = env_name
        self.allowed_envs = allowed_envs
        super().__init__(f"Invalid environment '{env_name}'. Allowed values: {allowed_envs}")

class PackageMissingError(DeploymentError):
    """Raised when required packages are missing."""
    def __init__(self, package_name, host):
        self.package_name = package_name
        self.host = host
        super().__init__(f"Package '{package_name}' is missing on host {host}.")

def deploy_app(environment, package):
    allowed_envs = ["staging", "production"]

    if environment not in allowed_envs:
        raise InvalidEnvironmentError(environment, allowed_envs)

    if environment == "production" and package == "critical-lib":
        raise PackageMissingError(package, f"server-{environment}")

    print(f"Deployment to {environment} with package {package} succeeded.")

for env, pkg in [("dev", "tool"), ("production", "critical-lib"), ("staging", "tool")]:
    try:
        deploy_app(env, pkg)
    except DeploymentError as e:
        print(e)

Signaling Errors: The `raise` Statement

Functions sometimes encounter states they cannot handle and must signal failure clearly.
Using raise triggers an exception, integrates with try...except, and stops execution immediately.
Prefer exceptions over special return values (None, False) to avoid ambiguous error handling.
Raising early enforces preconditions and supports the "fail fast" principle.

def process_servers(server_list):
    if not isinstance(server_list, list):
        # return None - BAD Practice, better to raise TypeError Exception
        raise TypeError("Input 'server_list' must be of type list.")

    # GOOD practice - Handle edge cases without raising exception
    if len(server_list) == 0:
        print("There are no servers to process. Exiting...")
        return

    print(f"Processing {len(server_list)} servers.")

# process_servers("abc") # Uncommenting will raise TypeError
process_servers([])
process_servers(["web01", "web02"])

Raising Built-in Exceptions

Built-in exception classes (e.g., TypeError, ValueError, FileNotFoundError) convey standard error semantics.
Raise TypeError when the argument’s type is wrong; raise ValueError when its value is out of acceptable range.
Use exceptions like OSError, ConnectionError, etc., when the built-in meaning matches your context.
Always include a clear, informative message describing the failure.

def set_deployment_replicas(count):
    """Example: enforce input type and value boundaries with built-in Exceptions."""
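    try:
        parsed_count = int(count)
    except TypeError as err:
        # Wrong type altogether (e.g., None or a list).
        raise TypeError(f"Replica count must be int or convertible to int, got {type(count).__name__}") from err
    except ValueError as err:
        # Acceptable type, but the content is not a number (e.g., "three").
        raise ValueError(f"Replica count must be a whole number, got {count!r}") from err

    if parsed_count < 0 or parsed_count > 100:
        raise ValueError(f"Replica count must be between 0 and 100, got {parsed_count}")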

    print(f"Replicas set to {parsed_count}")

for val in [5, -2, "three", 150, "5", 5.0]:
    try:
        set_deployment_replicas(val)
    except (TypeError, ValueError) as e:
        print(f"Caught error: {e}.")

Common Built‑in Exceptions

Python ships with a rich hierarchy of exception classes; most automation errors fall into a small, predictable subset.
All ordinary run‑time exceptions inherit from Exception, but subclasses convey why something failed (e.g., file missing vs. wrong type).
Catching overly broad bases like Exception hides root causes and can mask bugs—prefer the narrowest class you can handle.
Understanding the inheritance tree lets you decide when a single except can cover many related problems (e.g., OSError).

import inspect, builtins

def show_tree(base, level=0, max_depth=1):
    if level > max_depth:
        return

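    for name, obj in vars(builtins).items():
        # Match direct subclasses only, so each class is listed once under its parent.
        # The name check skips aliases such as IOError, which is another name for OSError.
        if inspect.isclass(obj) and base in obj.__bases__ and obj.__name__ == name:
            print("\t" * level + f"- {name}")
            show_tree(obj, level + 1, max_depth)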

show_tree(Exception, max_depth=1)
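
To check which base classes a single except clause would cover, one option is to print the method resolution order of a few built-in exceptions. This is a small sketch and the selection of classes is arbitrary. Any class that appears in a chain can be used in except to catch the exception at the start of that chain.

```python
for exc in (FileNotFoundError, PermissionError, ConnectionRefusedError, KeyError):
    # __mro__ lists the class itself followed by every ancestor, nearest first.
    chain = " -> ".join(cls.__name__ for cls in exc.__mro__)
    print(chain)

# except OSError would catch the first three; KeyError belongs to LookupError instead.
print(issubclass(ConnectionRefusedError, OSError))
print(issubclass(KeyError, OSError))
```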

`OSError` Family: Filesystem & Network Issues

Signals problems interacting with the operating system: files, permissions, sockets, paths.
Subclasses such as FileNotFoundError, PermissionError, IsADirectoryError, ConnectionRefusedError, and TimeoutError offer granularity.
Catch individual subclasses when you can recover differently (create a missing file, prompt for sudo, retry a connection).
A single except OSError still groups all OS‑level failures when the response is the same (e.g., log and abort).

try:
    with open('nonexistent.txt', 'r') as file:
        content = file.read()
except FileNotFoundError as e:
    print("File not found")
except PermissionError:
    print("Permission denied when accessing resources.")
except OSError as os_err:
    print(f"General OS error: {os_err}")

`KeyError`: Missing Dictionary Keys

Raised when using dict[key] with a key that is absent.
Frequent in config loading, JSON parsing, or environment variable maps.
Mitigation patterns: dict.get(key, default), membership tests (if key in cfg), or a tailored except KeyError.
Treats missing data distinctly from a wrong value (ValueError) or wrong type (TypeError).

config = {"host": "server1", "port": 8080}
config2 = {"host": "server1", "port": 8080, "api-key": "12345"}

api_key = config.get("api-key", "")
print(f"API Key: {api_key}")

def call_endpoint(config, endpoint):
    """Calls the specified endpoint of the configured host.

    Args:
        config (dict(str)): Dict containing host, port, and api-key
        endpoint (str): The endpoint to hit
    """
    if "api-key" in config:
        # Single quotes inside the f-string keep this valid on Python versions before 3.12.
        print(f"Making API call to endpoint {endpoint} with key: {config['api-key']}")
    else:
        print("No api-key available, not possible to call API.")

def call_endpoint_exception(config, endpoint):
    """Calls the specified endpoint of the configured host.

    Args:
        config (dict(str)): Dict containing host, port, and api-key
        endpoint (str): The endpoint to hit
    """
    try:
        # Single quotes inside the f-string keep this valid on Python versions before 3.12.
        print(f"Making API call to endpoint {endpoint} with key: {config['api-key']}")
    except KeyError as missing_key:
        print(f"No required key {missing_key} available, not possible to call API.")

call_endpoint(config, "/users")
call_endpoint(config2, "/users")

call_endpoint_exception(config, "/users")
call_endpoint_exception(config2, "/users")

`IndexError`: Sequence Index Out of Bounds

Triggered when list/tuple indices fall outside the valid range: negative beyond the left edge or ≥ len(seq).
Common during iterative processing of dynamic lists or user‑provided indexes.
Prevent with bounds checks (if i < len(seq)), safe iteration (for item in seq:), or catch and default.
Signals "wrong position" rather than "wrong content".

servers = ["web01", "web02"]

i = 2

if i < len(servers):
    print(servers[i])

try:
    print(servers[i])
except IndexError as e:
    print(f"Index error: {e}. List length is {len(servers)}")
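
The check i < len(servers) does not guard against negative indexes that run past the left edge. A bounds check covering both edges could look like the sketch below. The helper safe_get is hypothetical and not part of the original notes.

```python
def safe_get(seq, index, default=None):
    """Hypothetical helper: return seq[index], or default when index is out of range."""
    # Valid indexes run from -len(seq) up to len(seq) - 1.
    if -len(seq) <= index < len(seq):
        return seq[index]
    return default

servers = ["web01", "web02"]

print(safe_get(servers, 1))
print(safe_get(servers, -2))
print(safe_get(servers, 2, default="no such server"))
print(safe_get(servers, -3, default="no such server"))
```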

`ValueError` vs. `TypeError`

ValueError: argument type is acceptable but content/value is invalid (e.g., int("abc")).
TypeError: operation applied to an object of the wrong type altogether (e.g., len(5) or "a" + 3).
Distinguishing them clarifies whether to validate content or convert types.
Catch them separately to craft precise user feedback.

try:
    port = int("http") # ValueError, since literal must be a valid base-10 value
except ValueError as e:
    print(f"Bad numeric string: {e}")

try:
    total = "Errors: " + 5
except TypeError as e:
    print(f"Type mismatch: {e}")

`AttributeError`: Missing Object Member

Raised when an attribute or method doesn't exist on the object referenced.
Often results from typos, unexpected None, or polymorphic functions returning different types.
Defensive techniques: hasattr(obj, "attr"), if obj is not None:, or narrow except AttributeError.
Conveys "object of this type doesn’t support that capability".

class Calculator:
    def add(self, a, b):
        return a + b

calc = Calculator()

if hasattr(calc, "subtract"):
    print(calc.subtract(10, 5))
else:
    print("Object has no attribute 'subtract'")

try:
    print(calc.subtract(10, 5))
except AttributeError as e:
    print(f"AttributeError caught: {e}.")

try:
    print(calc.result)
except AttributeError as e:
    print(f"AttributeError caught: {e}.")

`ImportError` / `ModuleNotFoundError`

Raised when an import statement cannot locate a module/package.
ModuleNotFoundError (Python 3.6+) is the specific subclass; catching ImportError also covers it.
Causes: misspelling, package not installed, wrong virtual environment, or PYTHONPATH issues.
Typical handling logs instructions and aborts early to avoid cascading failures.

try:
    import non_existent_lib
except ModuleNotFoundError as e:
    print(f"Import failed: {e}. Is the library installed and the correct venv active?")
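
When a dependency is optional rather than required, an alternative to aborting is to fall back to another module. The sketch below assumes a hypothetical third-party module name, hypothetical_fast_json, chosen only so that the import fails. It also shows that except ImportError covers ModuleNotFoundError.

```python
try:
    # Hypothetical module name, used only to trigger the fallback path.
    import hypothetical_fast_json as json_impl
except ImportError:
    # ModuleNotFoundError is a subclass of ImportError, so it is caught here too.
    import json as json_impl

print(f"Using JSON implementation: {json_impl.__name__}")
print(json_impl.dumps({"replicas": 3}))
```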

Enhancing Functions: Decorators

A decorator is a callable that takes another function, adds behaviour before and/or after it runs, and returns a new callable.
They solve cross‑cutting concerns such as logging, timing, permission checks, or retries without cluttering core logic.
The magic @decorator_name syntax is shorthand for passing the target function to the decorator and re‑binding the original name to the returned wrapper.

Decorator Anatomy (Manual View)

Outer decorator function accepts the target function and creates a wrapper inside it.
The wrapper usually takes *args, **kwargs so it can handle any signature.
Wrapper executes optional "before" code, calls the original, maybe does "after" code, and returns the original’s result.
Returning the wrapper from the decorator completes the transformation.

Using decorators:

Manually wrapping illustrates what @ syntax really does behind the scenes.
This approach is clear but repetitive: @ eliminates the manual reassignment step.

import time

def simple_task(sleep_duration):
    time.sleep(sleep_duration)
    print("Running a simple task...")

def timing_decorator(original_function):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = original_function(*args, **kwargs)
        duration = time.perf_counter() - start
        print(f"{original_function.__name__} took {duration:.3f}s")

        return result

    return wrapper

simple_task = timing_decorator(simple_task)
simple_task(0.3)

The `@` Syntax

Placing @decorator_name directly above def my_func(): triggers my_func = decorator_name(my_func) at definition time.
After that line is executed, my_func refers to the wrapper returned by the decorator, so callers automatically get enhanced behaviour.
This keeps the decoration visible and close to the function definition, improving readability.

@timing_decorator
def another_task():
    print("Running another task...")

another_task()

Configurable Decorators: Decorators with Arguments

A basic decorator adds fixed behaviour; sometimes you need to configure that behaviour (e.g. how many retries, which log level).
You cannot pass options directly to a plain @decorator, because that decorator receives only the target function.
Solution: call a factory that takes options and returns a decorator, then apply it with @factory(option=value).

def timing_decorator(original_function):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = original_function(*args, **kwargs)
        duration = time.perf_counter() - start
        print(f"{original_function.__name__} took {duration:.3f}s")

        return result

    return wrapper

The Decorator Factory Pattern

Factory function receives configuration arguments and returns the actual decorator.
The actual decorator still takes the target function and builds a wrapper.
The wrapper can access both the factory’s configuration (via a closure) and the call‑time *args / **kwargs for the target function.
Three nested layers keep concerns separated: configuration ➜ decoration ➜ runtime.

Applying Decorators with Arguments

Use @factory(arg1, arg2…) above the function definition.
At definition time Python calls the factory, gets back a decorator, and applies that decorator to the function.
Callers of the function automatically get the behaviour configured by the factory.
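
The two steps that happen at definition time can be written out by hand, as in this minimal sketch. The factory tag_output and both status functions are hypothetical names introduced for illustration.

```python
def tag_output(prefix):
    """Hypothetical factory: returns a decorator that prefixes the result with `prefix`."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            return f"{prefix}{func(*args, **kwargs)}"
        return wrapper
    return decorator

@tag_output("[INFO] ")
def status():
    return "service healthy"

def status_manual():
    return "service healthy"

# Equivalent to the @ form: call the factory first, then apply the returned decorator.
status_manual = tag_output("[INFO] ")(status_manual)

print(status())
print(status_manual())
```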

Example: Retry Decorator Factory

A practical DevOps scenario: retry a flaky operation a configurable number of times.
The factory takes max_attempts; the wrapper loops until success or until attempts are exhausted, re‑raising the last error.

import random

def retry(max_attempts=3):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    print(f"Attempt {attempt}/{max_attempts}")
                    return func(*args, **kwargs)
                except Exception as e:
                    print(f" Error: {e}")
                    if attempt == max_attempts:
                        raise

        return wrapper
    return decorator

@retry(4)
def sometimes_fails():
    if random.random() < 0.7:
        raise RuntimeError("Flaky failure")
    return "Success!"

print(f"Result: {sometimes_fails()}")

Decorators & Return Values

A decorator’s wrapper replaces the original function, so if it forgets to return the original result the caller receives None.
Many real‑world functions produce critical data (e.g. status strings, dictionaries, numeric results); the decorator must be transparent about that value.
Fixing this means capturing the result of func(*args, **kwargs) inside the wrapper and returning it unchanged.

def log_calls_broken(func):
    def wrapper(*args, **kwargs):
        print(f"LOG: Calling {func.__name__}")
        func(*args, **kwargs)
        print(f"LOG: Finished {func.__name__}")
    return wrapper

@log_calls_broken
def add(x, y):
    return x + y

print(f"Result seen by caller: {add(2, 3)}")

The Wrapper’s Responsibility

The wrapper is the public face of the decorated function; it must faithfully:
- Call the original with all arguments.
- Capture its return value.
- Perform any extra behaviour (log, time, validate).
- Return the captured value so callers remain unaware of the wrapper.
Failure to return breaks contracts and causes subtle bugs.

Capturing return values

Capturing is a one‑liner: value = func(*args, **kwargs).
After post‑call logic, return value preserves behaviour.
You can also inspect or transform value before returning if the decorator’s purpose demands it.

def log_calls(func):
    def wrapper(*args, **kwargs):
        print(f"LOG: Calling {func.__name__}")
        value = func(*args, **kwargs)
        print(f"LOG: Finished {func.__name__}")
        return value
    return wrapper

@log_calls
def multiply(a, b):
    return a * b

print(f"Result seen by caller: {multiply(2, 3)}")
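
A decorator whose purpose is to transform the captured value might look like the following sketch. The decorator normalize_status and the two decorated functions are hypothetical. Strings are cleaned up; every other type is returned unchanged.

```python
def normalize_status(func):
    """Hypothetical decorator: strip whitespace and lowercase a string result."""
    def wrapper(*args, **kwargs):
        value = func(*args, **kwargs)
        # Transform only strings; pass any other type through untouched.
        if isinstance(value, str):
            return value.strip().lower()
        return value
    return wrapper

@normalize_status
def read_status():
    return "  RUNNING \n"

@normalize_status
def read_port():
    return 8080

print(repr(read_status()))
print(read_port())
```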

Handling Exceptions in Decorators

Wrappers often log exceptions for observability but should re‑raise them so callers can still handle or see errors.
Use try ... except ... raise around the call; log inside the except, then re‑raise without arguments to preserve traceback.
A decorator that swallows exceptions changes program semantics unless that is its explicit purpose (e.g. retry).

def log_and_reraise(func):
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as err:
            print(f"[ERROR] {func.__name__} raised {err.__class__.__name__}")
            raise
    return wrapper

@log_and_reraise
def fail():
    raise ValueError("simulated problem")

fail()

`functools.wraps`

A decorator replaces the original function object with its wrapper, so introspection tools see the wrapper’s metadata instead of the original’s.
Attributes such as __name__, __doc__, __module__, and type‑hint annotations are lost or altered.
This confuses debuggers, documentation generators, and anyone relying on help(), inspect, or error traces that reference the function name.
Python’s functools module supplies @wraps(original_func); apply it inside your decorator to the wrapper.
@wraps copies key metadata from the original function onto the wrapper, so the decorated function still looks like the original externally.

def broken_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@broken_decorator
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

print("Introspection without @wraps:")
print(f"  __name__: {add.__name__}")
print(f"  __doc__: {add.__doc__}")

from functools import wraps

def correct_decorator(func):
    @wraps(func) # Best practice: Always use it!
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@correct_decorator
def multiply(a, b):
    """Return the product of two numbers."""
    return a * b

print("Introspection with @wraps:")
print(f"  __name__: {multiply.__name__}")
print(f"  __doc__: {multiply.__doc__}")
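
Besides copying metadata, @wraps also stores the undecorated function on the wrapper as __wrapped__, which can help when testing the original in isolation. A short sketch, with passthrough and divide as illustrative names:

```python
from functools import wraps

def passthrough(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@passthrough
def divide(a, b):
    """Return a divided by b."""
    return a / b

print(divide.__name__)
print(divide.__doc__)
# __wrapped__ points at the original, undecorated function.
print(divide.__wrapped__(10, 4))
```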

Stacking Decorators: Applying Multiple Layers

Python lets you attach more than one decorator to a single function by writing multiple @decorator lines above the def.
Each decorator contributes a distinct slice of behaviour (logging, timing, caching, auth checks) keeping the core function clean.

Application vs. Execution Order

Decoration happens bottom‑up when the function is defined:
1. Decorator nearest the def wraps the original first.
2. Each line above wraps the result of the previous decoration.
Execution happens top‑down (outside‑in) when the decorated function is called: the outermost wrapper runs first, then calls the inner wrapper, and so on until the original function runs.

Order Matters

Swapping decorator order changes both side‑effects and final result if wrappers transform the return value.


from functools import wraps

def decorator_A(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print("A before")
        result = func(*args, **kwargs)
        print("A after")
        return result
    return wrapper

def decorator_B(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print("B before")
        result = func(*args, **kwargs)
        print("B after")
        return result
    return wrapper

@decorator_A
@decorator_B
def foo():
    print("  >>> inside function foo")

@decorator_B
@decorator_A
def bar():
    print("  >>> inside function bar")

foo()

print("----")

bar()
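
The example above shows the order of side effects. The effect on the final result can be sketched with two decorators that transform the return value. The names add_one, double, base_a, and base_b are illustrative.

```python
from functools import wraps

def add_one(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + 1
    return wrapper

def double(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) * 2
    return wrapper

@add_one
@double
def base_a():
    return 5

@double
@add_one
def base_b():
    return 5

print(base_a())  # double is applied to 5 first, then add_one: 11
print(base_b())  # add_one is applied to 5 first, then double: 12
```

The decorator nearest the def transforms the raw result first, so the same two decorators give 11 in one order and 12 in the other.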

Mode	Reads?	Writes?	Creates if missing?	Truncates on open?
r	✅	❌	❌	❌
r+	✅	✅	❌	❌
w	❌	✅	✅	✅
w+	✅	✅	✅	✅
a	❌	✅	✅	❌
a+	✅	✅	✅	❌
