
How to Build Docs for LLMs

Steve

AI is changing the way developers build applications and the tools they use every day. But if there’s one thing we all know, it’s that the landscape and tools used to build apps are evolving faster than LLMs can keep up. You might be building an app with Next.js v14 while your LLM’s knowledge only goes up to v13, and v15 just came out! To keep LLMs and AI useful, we have to provide context about the tools being used on the front lines of app building. That’s why, after seeing some inspiration around the web, we built Pinata AI Docs: a resource for LLMs and AI-powered text editors and chat apps. The response to these tools has been so positive that we thought we’d put together a quick tutorial on how you can build your own AI docs!

Building the Text

The most important part of creating AI docs for your platform is the text itself. This is the crucial information an LLM will ingest as context to assist the end user, so it has to be accurate but also succinct. There are a few different ways to create it, so we’ll go over some methods along with their pros and cons.

Using AI

I suppose it shouldn’t come as a surprise that perhaps the easiest way to make AI docs is to use AI! This is the path we took when building our own AI docs, so we’ll go into some detail on the steps involved. One thing that helped a bunch is that our docs are already open source, including markdown files and OpenAPI specs for our APIs. To start building the AI docs, we used Zed’s context feature, gave it the SDK folder and the OpenAPI spec as context, then asked it to write up shortened docs specifically designed for LLMs that contained all the necessary pieces, such as methods, endpoints, parameters, and responses. Once it generates your initial doc, the fun and tedious work begins. In our opinion, this method is most effective when you hand review what the LLM produces and tweak it. You might have to prompt a few more times, adjusting until you get the docs where you want them. It only took us about 20 minutes to have a finished product!
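
If you’re looking for a starting point, a prompt along these lines captures the idea (illustrative, not the exact prompt we used):

You have access to our SDK source folder and our OpenAPI spec.
Write a single condensed reference doc designed to be pasted into an
LLM's context window. For every SDK method and API endpoint, include
the signature or HTTP verb and path, the parameters with their types,
a short example request, and the shape of the response. Skip marketing
copy, changelogs, and any explanation longer than a sentence.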

Using External Tools

This approach of having AI docs has been taking off quite a bit, to the point that there was a proposal for an /llms.txt standard that helps define what these succinct docs could look like. Mintlify, one of the primary docs providers out there, has actually built this into all doc sites for their users by default. You can test it yourself by visiting https://docs.pinata.cloud/llms.txt. The only con to this approach is that the initial /llms.txt file is an index of the docs, not the content itself. That means the LLM also has to crawl and visit the pages inside, and depending on your tool, it may not fulfill all the requirements. There’s also /llms-full.txt, which provides the docs in their entirety, but at that point it’s far too much context for most LLMs to handle. Another tool that could be used is Gitingest, which takes in a GitHub repo and turns it into a single text digest you can feed to an LLM. The benefit of this approach is that you can limit the file size ingested, but that also means you might be missing out on some context.
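
As a quick sanity check, you can pull the index down yourself. Here’s a minimal sketch in TypeScript (run as an ES module in any runtime with a global fetch, such as Node 18+):

// Fetch the llms.txt index that Mintlify generates for a docs site.
const response = await fetch("https://docs.pinata.cloud/llms.txt");
if (!response.ok) {
	throw new Error(`Request failed with status ${response.status}`);
}
const index = await response.text();

// The file is an index of links, not the docs themselves, so a tool
// still has to follow each link to reach the actual content.
console.log(index.split("\n").slice(0, 10).join("\n"));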

Building a Client

After you have the main text for your AI Docs, you’re going to want an easy way for your users to ingest it. What makes this tricky is that different tools have different methods of getting context: Cursor can take a URL and use it as a doc, while Zed or Claude don’t do any fetching and need the docs added manually. To handle these different cases, we built a simple Astro site where the user can copy the text directly or access links to tutorials on how to use the AI Docs with their tool. Under the hood, we also used some extra utilities to check whether a request for the root of the website is programmatic or not.

interface RequestHeaders {
	get(name: string): string | null;
}

export interface RequestLike {
	headers: RequestHeaders;
}

export function isProgrammaticRequest(request: RequestLike): boolean {
	// Normalize the headers we care about; missing headers become empty strings.
	const userAgent = request.headers.get("user-agent")?.toLowerCase() || "";
	const acceptHeader = request.headers.get("accept")?.toLowerCase() || "";

	// Substrings that commonly appear in non-browser user agents.
	const programmaticIndicators = {
		commandLineTools: [
			"curl",
			"wget",
			"postman",
			"insomnia",
			"python",
			"node",
			"axios",
			"got",
			"fetch",
			"apache",
			"bot",
		],

		bots: ["googlebot", "bingbot", "baiduspider", "yandexbot", "duckduckbot"],

		browsers: ["mozilla", "chrome", "safari", "firefox", "edge", "opera"],
	};

	const isCommandLineTool = programmaticIndicators.commandLineTools.some(
		(tool) => userAgent.includes(tool),
	);

	const isBot = programmaticIndicators.bots.some((bot) =>
		userAgent.includes(bot),
	);

	const isBrowser = programmaticIndicators.browsers.some((browser) =>
		userAgent.includes(browser),
	);

	// curl and similar tools typically send Accept: */* or request plain text.
	const isAcceptPlainText =
		acceptHeader.includes("text/plain") || acceptHeader === "*/*";

	// Additional hints that the request came from script code rather than page navigation.
	const isProgrammaticHeaders =
		request.headers.get("sec-fetch-mode") === "programmatic" ||
		request.headers.get("x-requested-with") === "XMLHttpRequest";

	// Flag CLI tools and bots outright; for everything else, require both a
	// non-browser user agent and a plain-text-friendly Accept header.
	return (
		isCommandLineTool ||
		isBot ||
		(!isBrowser && isAcceptPlainText) ||
		isProgrammaticHeaders
	);
}

With this function, we can determine whether the root of the website should return the HTML with all the helpful links and UI, or just the raw text.

---
import Layout from "../layouts/Layout.astro";
import { getFilesContent } from "../utils/content";
import { isProgrammaticRequest } from "../utils/requestHelpers";
import { Image } from "astro:assets";
import logo from "../assets/logo.png";

const { request } = Astro;

// Serve the raw text to programmatic clients instead of the HTML page
if (isProgrammaticRequest(request)) {
	const text = await getFilesContent();
	return new Response(text, {
		headers: {
			"Content-Type": "text/plain",
			"Cache-Control": "no-store",
		},
	});
}

const content = await getFilesContent();
---

<!-- Rest of the UI rendering `content` -->
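
If you want to sanity-check the detection logic, you can call isProgrammaticRequest with mock requests built from the standard Headers class. A minimal sketch (the user agents here are just examples):

import { isProgrammaticRequest } from "./utils/requestHelpers";

// Build a RequestLike from plain header pairs using the standard Headers class.
const mockRequest = (headers: Record<string, string>) => ({
	headers: new Headers(headers),
});

// A curl-style request should be treated as programmatic...
console.log(
	isProgrammaticRequest(mockRequest({ "user-agent": "curl/8.4.0", accept: "*/*" })),
); // true

// ...while a typical browser navigation should get the HTML page.
console.log(
	isProgrammaticRequest(
		mockRequest({
			"user-agent": "Mozilla/5.0 (Macintosh) Chrome/120.0",
			accept: "text/html",
		}),
	),
); // false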

To take a closer look or use it yourself, check out the open source repo for this client!

Wrapping Up

It’s hard to say if AI is a bubble that will eventually pop, but the help your users can get from AI is undeniable. I’m sure we’ll see more services and APIs build AI doc sites like these in the near future, so it helps to get a head start and build it yourself!

If you want to go a step further you could follow this tutorial on how to vectorize your docs and build a chat app like we did with Pinata Docs Chat, giving your users another level of interaction and reducing the friction for them to get answers. We can’t wait to see what you build!

