The Receipts are Fake. Verifiability is Real.
Six months ago, our co-founder Kyle experimented with an AI-generated video dataset that included timestamps, device ids, face embeddings, and more, then created images that matched one of the rows in the dataset. The images looked real, but they, like the dataset itself, were entirely AI-generated. The reality that anything could be easily faked was here, and it was alarming. At Pinata, we’d been discussing and experimenting with verifiability use cases for AI for months. In fact, verifiability is what Pinata was built for back in 2018. Six months later, the need is more evident than ever, with the data to back it up (no pun intended).
This week, several articles were posted about employees using AI-generated fake receipts in workplace expense claims. In one article by Cryptopolitan, AppZen reported that 14 percent of the fraudulent expenses submitted in September were AI-generated, up from 0 percent last year. Detection software from Ramp found over $1 million in fake invoices in 90 days (Cryptopolitan).
More Than Just Receipts
AI-generated fake receipts are just the beginning. Fake invoices, resumes, identification docs, certifications, contracts, and even legal evidence can be AI-generated, and it will only become harder to determine authenticity. Financial Times, TechRadar, and The Decoder all reported similar cases of employees using generative AI to fake receipts and expense reports. According to Fingerprint, 41% of fraud targeting organizations is powered by AI, with a reported annual loss of $414,000. Over a third of respondents in their report saw losses of $1 million (Fingerprint). No company wants to lose any amount of money to fraud.
AI-generated content has made it frictionless for someone to create a fake document or dataset. For companies, this creates an outsized risk of financial loss, data breaches, and compliance violations. For individuals, there is a risk of identity theft, manipulated data, and scams. We’ve seen this at scale in the current job market, where deepfake candidates and fake jobs have conned individuals out of personal data and money. AI makes deception inexpensive, fast, and scalable, so when systems rely on trust, they’re exposed.
AI vs. AI in the Fraud Detection Arms Race
Generative AI tools have evolved so quickly that fraudulent images and text are nearly undetectable without metadata. Thankfully, many expense and finance tools already use AI for fraud detection, scanning patterns, metadata, and anomalies to identify fraud. While AI fraud detection works, it’s reactive. Generative AI is proactive, and AI fraud detection is trying to keep up. As quickly as one AI model is built to catch fraud, another can be trained to evade it. Instead of detection after the fact, we need verifiability at the source.
Paired with AI, IPFS is the solution to the problem.
Enter IPFS and Pinata
IPFS (InterPlanetary File System) provides the fingerprint for data. When you upload a file like a photo, PDF, or dataset to Pinata, it is automatically assigned a unique Content Identifier (CID), which is derived from a cryptographic hash of the file’s contents. If anything in the file changes, the CID changes. The CID is your “proof” that the file is the original. Pinata makes it easy for you to upload and manage files on IPFS.
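To make the "if anything in the file changes, the CID changes" idea concrete, here is a minimal sketch in Python. It assumes a plain SHA-256 hex digest as a simplified stand-in for a CID. Real IPFS CIDs also involve chunking and additional encoding, so the strings below are not actual CIDs, and the `fingerprint` helper and sample receipts are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of the data.

    Assumption: this is a simplified stand-in for a CID, not a real one.
    """
    return hashlib.sha256(data).hexdigest()


# Hypothetical receipt contents, used only for illustration.
original = b"Receipt #1042 | Coffee | $4.50"
altered = b"Receipt #1042 | Coffee | $45.00"

# The same bytes always produce the same fingerprint.
print(fingerprint(original) == fingerprint(original))  # True

# Changing the amount produces a different fingerprint.
print(fingerprint(original) == fingerprint(altered))  # False
```

The property that matters is that the identifier is computed from the bytes themselves, so nobody has to trust a filename or a sender to know whether two files are the same.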
Using CIDs, someone can verify authenticity and detect tampering, all based on the string of characters that identifies the original file. When it comes to AI-generated fake receipts, this works when the original’s CID is recorded at the source: an AI-generated or altered document can then be stopped from passing through because its CID won’t match the original upload. CIDs also help prevent duplicate submissions because byte-identical files share the same CID. Complementing AI fraud detection tools that flag suspicious patterns, IPFS adds cryptographic proof that a file hasn’t been tampered with. Using Pinata’s API, CID verification can be integrated into existing workflows. Verifiability will be critical for everything from journalism to research to datasets, including data used to train AI models.
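The checks described above (matching a submission against the original upload and catching duplicates) could be sketched roughly as follows. This is an illustrative in-memory example, not Pinata’s API. The `ReceiptRegistry` class, the `Status` values, and the sample receipts are all hypothetical, and a SHA-256 hex digest again stands in for a real CID.

```python
import hashlib
from enum import Enum


class Status(Enum):
    VERIFIED = "verified"    # matches a registered original, first submission
    DUPLICATE = "duplicate"  # matches an original that was already submitted
    UNKNOWN = "unknown"      # matches nothing registered: altered or fabricated


def fingerprint(data: bytes) -> str:
    # Assumption: simplified stand-in for a CID, not a real one.
    return hashlib.sha256(data).hexdigest()


class ReceiptRegistry:
    """Hypothetical registry of fingerprints recorded at the source."""

    def __init__(self) -> None:
        self._issued: set[str] = set()
        self._submitted: set[str] = set()

    def register(self, data: bytes) -> str:
        """Record the fingerprint of an original document."""
        fp = fingerprint(data)
        self._issued.add(fp)
        return fp

    def check(self, data: bytes) -> Status:
        """Classify a submitted document against the registered originals."""
        fp = fingerprint(data)
        if fp not in self._issued:
            return Status.UNKNOWN
        if fp in self._submitted:
            return Status.DUPLICATE
        self._submitted.add(fp)
        return Status.VERIFIED


registry = ReceiptRegistry()
receipt = b"Receipt #1042 | Coffee | $4.50"
registry.register(receipt)

print(registry.check(receipt))                             # Status.VERIFIED
print(registry.check(receipt))                             # Status.DUPLICATE
print(registry.check(b"Receipt #1042 | Coffee | $45.00"))  # Status.UNKNOWN
```

Note that a fingerprint match only shows that a file is identical to what was registered. It says nothing about whether the registered original was genuine, which is why recording the identifier at the source matters.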
Truth in Data Matters (and Pays)
- Financial Integrity: When data is verifiable, companies can avoid processing fake documents, whether that’s a receipt or a resume.
- Trust at Scale: CIDs let people and organizations trust the data, not just the sender. This is important for numerous industries like finance, research, media, and AI, where proof of origin matters.
- Audit Efficiency: Verifiable data means faster approvals, transparent records, and less manual verification, which can save time and money.
- Reputation and Resilience: Having cryptographic proof of authenticity builds trust in an age of AI-generated fakes.
Verifiable content storage isn’t just for crypto; it’s for anyone who needs truth in data. Start with Pinata.