AI-Assisted Analysis Validation Log

This document records AI-generated insights from the original Project 2 analysis alongside their validation status. Each insight was tested against the actual complaints.csv dataset using Python and/or SQL. This process demonstrates the importance of validating AI-generated claims before including them in stakeholder reports.

Dataset: 542 CFPB complaints, North Carolina, Credit Card Fees & Interest, Jan 2024 – Apr 2026

Validation method: Python (pandas) and Athena-compatible SQL queries


Validation Summary

# | AI-Generated Insight | Status | Evidence
1 | Synchrony Financial is the top complaint source | ✅ Accepted | 123 complaints (22.7%)
2 | “Problem with fees” dominates sub-issues | ✅ Accepted | 355 complaints (65.5%)
3 | Companies respond timely nearly 100% of the time | ✅ Accepted | 99.6% timely rate
4 | Most complaints are closed without monetary relief | ✅ Accepted | 63.3% closed with explanation only
5 | Complaint volume spikes in early calendar year months | ⚠️ Partially Accepted | Jan 2025 (27) and Jan 2026 (27) are elevated, but not consistently the highest months
6 | “Promotional rate” is a common narrative theme | ❌ Rejected | 0 occurrences in 337 narratives
7 | Consumers frequently mention “APR” in narratives | ❌ Rejected | Only 14 occurrences (4.2% of narratives)
8 | Monetary relief is rare (under 10% of complaints) | ❌ Rejected | 169 complaints (31.2%) received monetary relief

Detailed Validation Records


Insight 1: Synchrony Financial is the top complaint source

AI-generated claim: “Synchrony Financial generates the most consumer complaints in this dataset.”

Validation method: Python value_counts() on the Company column.

Result:

SYNCHRONY FINANCIAL                    123
CITIBANK, N.A.                          71
CAPITAL ONE FINANCIAL CORPORATION       67
Bread Financial Holdings, Inc.          67

Status: ✅ Accepted Synchrony Financial leads with 123 complaints, more than 1.7× the next closest company. The claim is accurate.
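The count-based checks in this log can be reproduced with a small helper along the following lines. This is a minimal sketch, not the exact code used for the validation: the function name count_with_share and the sample frame are hypothetical, and only the column name Company comes from the dataset.

```python
import pandas as pd


def count_with_share(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return the count and percentage share of each value in `column`.

    value_counts() skips missing values, while the share is computed
    against all rows, so percentages may not sum to 100 if the column
    contains blanks.
    """
    counts = df[column].value_counts()
    share = (counts / len(df) * 100).round(1)
    return pd.DataFrame({"count": counts, "pct": share})


# Tiny illustrative frame, NOT the real complaints.csv
sample = pd.DataFrame({"Company": ["A", "A", "A", "B", "C"]})
summary = count_with_share(sample, "Company")
print(summary)
```

On the real data, the same helper could be called with "Sub-issue" or "Company response to consumer" to obtain the counts and percentages reported for the other count-based insights.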


Insight 2: “Problem with fees” dominates sub-issues

AI-generated claim: “Fee-related complaints are the most common sub-issue by a wide margin.”

Validation method: Python value_counts() on the Sub-issue column.

Result:

Problem with fees                        355  (65.5%)
Charged too much interest                148  (27.3%)
Unexpected increase in interest rate      39   (7.2%)

Status: ✅ Accepted “Problem with fees” accounts for nearly two-thirds of all complaints. The claim is accurate.


Insight 3: Companies respond timely nearly 100% of the time

AI-generated claim: “Financial institutions in this dataset are highly compliant with CFPB response deadlines.”

Validation method: Python boolean count on Timely response? column.

Result:

Yes: 540 (99.6%)
No:    2  (0.4%)

Status: ✅ Accepted The 99.6% timely response rate confirms the claim. Note: timely response measures deadline compliance, not consumer satisfaction.
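A timely-rate check of this kind might look like the sketch below. It assumes the Timely response? column holds the strings "Yes" and "No", as the result above suggests; the function name timely_rate and the sample frame are hypothetical.

```python
import pandas as pd


def timely_rate(df: pd.DataFrame, column: str = "Timely response?") -> float:
    """Percentage of rows whose value in `column` is exactly "Yes"."""
    is_timely = df[column].eq("Yes")  # boolean Series; missing values count as not timely
    return round(float(is_timely.mean()) * 100, 1)


# Illustrative data only: 9 timely responses out of 10
sample = pd.DataFrame({"Timely response?": ["Yes"] * 9 + ["No"]})
rate = timely_rate(sample)
print(rate)

# The reported figure follows from the counts in this log: 540 of 542
reported = round(540 / 542 * 100, 1)
print(reported)
```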


Insight 4: Most complaints are closed without monetary relief

AI-generated claim: “The majority of complaints are resolved with an explanation rather than financial compensation.”

Validation method: Python value_counts() on Company response to consumer.

Result:

Closed with explanation          343  (63.3%)
Closed with monetary relief      169  (31.2%)
Closed with non-monetary relief   20   (3.7%)
In progress                       10   (1.8%)

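Status: ✅ Accepted 63.3% of complaints were closed with an explanation only. Counting non-monetary relief and in-progress cases as well, 373 of 542 complaints (68.8%) had not received monetary relief. The claim is accurate.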


Insight 5: Complaint volume spikes in early calendar year months

AI-generated claim: “Complaints tend to increase at the start of the year, possibly tied to year-end billing cycles.”

Validation method: Python monthly groupby on Date received.

Result (January months):

2024-01: 18
2025-01: 27
2026-01: 27

Highest months overall:

2026-03: 38  (highest)
2025-01: 27
2026-01: 27
2025-06: 25

Status: ⚠️ Partially Accepted January months do show elevated volumes in 2025 and 2026, but the single highest month is March 2026 (38 complaints). The seasonal pattern is suggestive but not conclusive with only 28 months of data. The insight is directionally reasonable but should not be stated as a confirmed pattern.
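The monthly grouping can be sketched as follows. This is an illustrative version under stated assumptions: Date received is assumed to parse with pd.to_datetime, and the function name monthly_counts and the sample dates are hypothetical.

```python
import pandas as pd


def monthly_counts(df: pd.DataFrame, date_column: str = "Date received") -> pd.Series:
    """Number of complaints per calendar month, in chronological order."""
    months = pd.to_datetime(df[date_column]).dt.to_period("M")
    return df.groupby(months).size()


# Illustrative dates only, NOT the real complaints.csv
sample = pd.DataFrame(
    {"Date received": ["2024-01-05", "2024-01-20", "2024-02-11", "2025-01-03"]}
)
counts = monthly_counts(sample)

# January months only, for comparison across years
january = counts[counts.index.month == 1]

# Busiest months overall, highest first
top = counts.sort_values(ascending=False).head(3)
print(january)
print(top)
```

Comparing the January rows against the top of the sorted list is what separates "January is elevated" from "January is the peak", which is the distinction the partial acceptance rests on.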


Insight 6: “Promotional rate” is a common narrative theme

AI-generated claim: “Consumers frequently mention promotional rates in their complaint narratives.”

Validation method: Python string search across 337 narratives for the term “promotional rate”.

Result:

"promotional rate": 0 occurrences

Status: ❌ Rejected Zero narratives mention “promotional rate.” This term does not appear in the dataset. The insight was likely generated based on general knowledge of credit card complaints rather than this specific dataset.
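A phrase search of this kind could be written as below. This is a sketch rather than the original code: the function name narratives_mentioning and the sample texts are hypothetical, and the narratives are assumed to be available as a pandas Series of strings with missing values for complaints that have no narrative.

```python
import pandas as pd


def narratives_mentioning(narratives: pd.Series, phrase: str) -> int:
    """Count narratives containing `phrase`, ignoring case.

    Missing narratives are dropped first, so the result is a count of
    narratives with text, not of all complaints.
    """
    text = narratives.dropna()
    return int(text.str.contains(phrase, case=False, regex=False).sum())


# Illustrative narratives only
sample = pd.Series(
    [
        "The promotional rate expired early",
        "I was charged a late fee",
        None,
        "Promotional Rate was never honored",
    ]
)
promo_hits = narratives_mentioning(sample, "promotional rate")
fee_hits = narratives_mentioning(sample, "late fee")
print(promo_hits, fee_hits)
```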


Insight 7: Consumers frequently mention “APR” in narratives

AI-generated claim: “APR is a commonly referenced term in consumer complaint narratives.”

Validation method: Python string search across 337 narratives for the term “apr” (case-insensitive).

Result:

"apr": 14 occurrences (4.2% of narratives with text)

Status: ❌ Rejected “APR” appears in only 14 of 337 narratives (4.2%). Consumers in this dataset tend to use plain language (“fee”, “interest”, “late fee”) rather than financial terminology. The claim overstates the frequency of this term.
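One caveat on short search terms: a plain case-insensitive substring match on "apr" also matches words such as "April", so if the search was a substring match, 14 is best read as an upper bound. The sketch below contrasts a substring match with a whole-word match; the sample texts and variable names are hypothetical, and a whole-word match would still count "Apr" used as a month abbreviation.

```python
import re

import pandas as pd

# Illustrative narratives only
sample = pd.Series(
    [
        "My APR went up without notice",
        "I called the bank in April",
        "The apr. on my card was raised",
        None,
        "A late fee was charged twice",
    ]
)
text = sample.dropna()

# Substring match: also counts "April"
substring_hits = int(text.str.contains("apr", case=False, regex=False).sum())

# Whole-word match: \b requires a word boundary on both sides of "apr"
word_hits = int(text.str.contains(r"\bapr\b", flags=re.IGNORECASE, regex=True).sum())

print(substring_hits, word_hits)
```

Either way the rejection stands, since a stricter match can only lower the count.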


Insight 8: Monetary relief is rare (under 10% of complaints)

AI-generated claim: “Very few consumers receive monetary relief from their complaints.”

Validation method: Python value_counts() on Company response to consumer.

Result:

Closed with monetary relief: 169 (31.2%)

Status: ❌ Rejected 31.2% of complaints resulted in monetary relief, nearly one in three. This is not “rare” by any reasonable definition, and it is more than three times the 10% ceiling in the AI-generated claim. This is a meaningful finding: almost a third of credit card fee and interest complaints in North Carolina ended in monetary relief. Whether that rate is high relative to other complaint categories cannot be established from this dataset, which covers a single category.
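The arithmetic behind this rejection can be checked directly from the response counts reported under Insight 4. The variable names below are illustrative; the 10% threshold comes from the insight's own wording.

```python
# Counts as reported in this log for "Company response to consumer"
response_counts = {
    "Closed with explanation": 343,
    "Closed with monetary relief": 169,
    "Closed with non-monetary relief": 20,
    "In progress": 10,
}

total = sum(response_counts.values())
monetary_pct = round(response_counts["Closed with monetary relief"] / total * 100, 1)

# The AI-generated claim was "under 10% of complaints"
claim_holds = monetary_pct < 10.0

print(total, monetary_pct, claim_holds)
```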


Key Takeaway

This validation exercise demonstrates that AI-generated insights can be directionally useful but require verification against actual data before inclusion in any report or dashboard. Three of eight insights were rejected, and one required qualification. The validation process is a core component of responsible data analytics practice.