Skip to content

Python

Regex To Match Upper Case Words

Here's a short demonstration of how to use a regular expression to identify UPPERCASE words in a bunch of text files.

The goal in this particular code snip is to open and read all of the .rtf files in a given directory and identify only the UPPERCASE words appearing in the file.

import os
import re

directory = '/path/to/files'
regex = r"\b[A-Z][A-Z]+\b"

for filename in os.listdir(directory):
    if filename.endswith(".rtf"):
        with open(filename, 'r') as f:
            transcript = f.read()
            matches = re.finditer(regex, transcript)
            for match in matches:
                print (match[0])