Using x402 to Monetize AI Hardware

Steve

Decentralized AI has been a hot topic over the last year or two. Our usage of and dependence on AI grows each day, and with it our reliance on centralized providers. Thankfully, local AI models are becoming more and more capable. The main problem you'll run into with these models is your own hardware: AI inference takes a considerable amount of memory (RAM) as well as GPU processing power, and capable GPUs are expensive. Blockchain projects have tried to solve this by building token-gated networks of nodes that handle AI tasks, but the result usually ends up far too complicated. What if it doesn't have to be?

The release of x402 by Coinbase is something we've been experimenting with heavily, and recently we asked ourselves, "what if we could have pay-to-use AI?" That's exactly what we built, and we even used it to write a blog post which you can read here. What really makes this experiment powerful is how well x402 fits with the AI tooling we already use, like Ollama and the OpenAI SDK for client requests. In this post, we'll give a brief overview of both the server/hardware setup and how clients can pay USDC on Base to access those machines, resulting in a simplified and truly distributed approach to AI.

Server Setup

In our experiment, we have a Jetson Orin Nano living on my desk in the hills of Tennessee (we have great internet, believe it or not). Setting it up to be a pay-per-use AI server was actually quite simple. First we needed to install Ollama, a framework for downloading and running AI models. Pretty much everything you need is already provided: install the framework, download some models with a command like ollama pull llama3.2, then make sure the daemon is running with systemctl status ollama if you're on Linux. This exposes an OpenAI-compatible API at http://localhost:11434/v1. With local access to the models, we can spin up a simple Hono instance to proxy requests behind x402.
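Before adding the paywall, it's worth sanity-checking that the OpenAI-compatible endpoint is actually reachable. A minimal check, assuming Ollama's default port:

// List the models Ollama exposes through its OpenAI-compatible API.
// With llama3.2 pulled, the output should include "llama3.2".
const res = await fetch("http://localhost:11434/v1/models")
const { data } = await res.json()
console.log(data.map((m: { id: string }) => m.id))

With that confirmed, here's the proxy: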

import { Hono } from 'hono'
import { paymentMiddleware } from "x402-hono"

const app = new Hono().basePath('/v1')

app.use(paymentMiddleware(
  "0xmywalletaddress",
  {
    "/v1/chat/completions": {
      price: "$0.10",
      network: "base-sepolia",
      config: {
        description: "Access to AI Prompts",
      }
    }
  },
  {
    url: "<https://x402.org/facilitator>", // Facilitator URL for Base Sepolia testnet.
  }
));

app.get('/', (c) => {
  return c.text('Pay to Prompt')
})

app.post("/chat/completions", async (c) => {
  const body = await c.req.json()
  const ollamaRes = await fetch('http://localhost:11434/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  });

  const ollamaData = await ollamaRes.json()
  return c.json(ollamaData)
})

export default app
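Once the middleware is in place, any request to the completions route that doesn't carry a payment should come back as HTTP 402 Payment Required, along with a JSON body describing how to pay. A quick illustrative check, assuming the app is running locally on Bun's default port of 3000:

// No X-PAYMENT header attached, so the middleware rejects the request
// with a 402 and returns the payment requirements instead of a completion.
const res = await fetch("http://localhost:3000/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.2",
    messages: [{ role: "user", content: "hello" }],
  }),
})

console.log(res.status) // 402
console.log(await res.json()) // what a client needs in order to pay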

The proxy gives us an almost identical API endpoint to the one the OpenAI SDK expects, except it now sits behind an x402 paywall where the user has to pay $0.10 in USDC to get the data. Of course, you could even make the payment rate dynamic, just like we did with pay to pin. Making this server accessible from the outside is also relatively easy. First, we set up a service to run in the background through systemd by creating an ollama-x402.service file under /etc/systemd/system with the following contents:

[Unit]
Description=X402 Ollama Application
After=network.target
Wants=network.target

[Service]
Type=simple
User=your-user
Group=your-user
WorkingDirectory=/home/your-user/x402-ollama
ExecStart=/home/your-user/.bun/bin/bun run --hot /home/your-user/x402-ollama/src/index.ts
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target

In this setup, we're using Bun as our runtime, with the Hono server listening on port 3000. Combine this with Cloudflare Tunnels and you have a secure custom domain pointed at your machine and the port the app runs on! We did exactly this so we could make API requests to a URL like this:

https://x402.mydomain.com/v1/chat/completions
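One note on ports: the server file above ends with export default app, which Bun serves on port 3000 by default. If you need a different port, Hono's Bun adapter lets you export it explicitly (a small sketch, not something the repo requires):

// Drop-in replacement for the final `export default app` line if you
// want to pin the port yourself instead of relying on Bun's default.
export default {
  port: 3000,
  fetch: app.fetch,
}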

That’s really all it takes to monetize AI prompts!

Client Setup

Now to the fun part: using the OpenAI SDK to make requests to our server. Since x402 works without any need for API keys, all you need is either x402-fetch or x402-axios to help with making the API calls and settling the payments with a local wallet. Here’s an example of the code we used to make the blog post mentioned earlier.

import { wrapFetchWithPayment } from "x402-fetch";
import { privateKeyToAccount } from "viem/accounts";
import OpenAI from "openai";
const account = privateKeyToAccount(process.env.PRIVATE_KEY as `0x${string}`);
const fetchWithPayment = wrapFetchWithPayment(fetch, account);

const chat = async () => {
  try {
    const client = new OpenAI({
      apiKey: "ollama",
      baseURL: "https://x402.mydomain.com/v1",
      fetch: fetchWithPayment,
    });

    const completion = await client.chat.completions.create({
      model: "llama3.2",
      messages: [
        {
          role: "user",
          content: `Write a blog post about how using crypto middleware to support payments for 
          inference requests to locally running AI models on someone's machine is the future of p2p AI.`,
        },
      ],
    });

    // Log the generated content that came back from the paid request
    console.log(completion.choices[0].message.content);
  } catch (error) {
    throw error;
  }
};

(async () => {
  try {
    await chat();
  } catch (error) {
    console.log(error);
    process.exit(1);
  }
})();

The real magic here is that the OpenAI client accepts a custom fetch property, where we can pass in fetchWithPayment and settle transactions with our x402 server. With this one simple script, and a local PRIVATE_KEY with some funds, we can pay $0.10 per prompt and get AI-generated content back. It's seriously that easy, and it provides the experience all developers expect from hosted AI services.
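And because the paywall is just HTTP, the same wrapped fetch works without the OpenAI SDK at all. A minimal sketch reusing the fetchWithPayment and URL from the script above:

// x402-fetch handles the 402 response, signs the payment, and retries,
// so a plain POST to the paid endpoint behaves like any other request.
const res = await fetchWithPayment("https://x402.mydomain.com/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.2",
    messages: [{ role: "user", content: "Say hello in five words." }],
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);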

If you’re interested in seeing the code for our server, then check out the repo with the link below!

GitHub - PinataCloud/jetson-x402

Wrapping Up

While our Jetson Orin Nano with 8GB of RAM may not be the powerhouse most developers need, there are far better hardware options out there that people could monetize. More importantly, this demonstrates how x402 can turn any computer into a node running specialized software that can be monetized instantly, with little to no effort. Pinata will continue to experiment, as we believe in a future where content on IPFS, blockchain, and AI intersect in ways we haven't seen yet.

Happy Pinning!
