An llms.txt file is easy to publish. The harder question is whether it is doing anything useful.
That distinction matters. A file can exist at example.com/llms.txt and still be badly structured, blocked by your server, filled with weak links, ignored by AI crawlers, or too vague to help an AI system understand your site.
Testing it properly means checking three things: whether it is accessible, whether it is technically valid, and whether it actually improves the answers AI tools give about your business, product, or content.
What an llms.txt file is supposed to do
An llms.txt file is a plain Markdown file placed at the root of your website. Its job is to give large language models a clean, structured guide to your most important content.
Think of it less like a ranking signal and more like an AI-readable content map. It should tell an AI system:
- what your site is about;
- which pages matter most;
- how your content is organised;
- where to find authoritative product, service, documentation, policy, or editorial pages;
- which URLs should be used as trusted sources.
A strong llms.txt file does not try to list every page on your site. It curates. It points AI systems towards the pages that best explain who you are, what you offer, and what information should be used when answering questions about you.
First, test whether the file is live
Start with the basic access test.
Open:
https://yourdomain.com/llms.txt
You should see a raw Markdown file, not a styled webpage, not a redirect chain, not a 404, and not a CMS error page.
Check the following:
- The file loads over HTTPS.
- It returns a 200 OK status.
- It is publicly accessible.
- It is not blocked behind a login, firewall, cookie banner, or geo-restriction.
- It is not accidentally redirected to your homepage.
- It uses clean Markdown formatting.
- It is available at the root of the domain, not buried somewhere obscure.
A simple browser check is not enough. Use a header checker, crawl tool, or command line request to confirm the server response.
For example:
curl -I https://yourdomain.com/llms.txt
You want to see something like:
HTTP/2 200
content-type: text/plain; charset=utf-8
If the file returns a 403, 404, 500, or redirect loop, it is not working properly.
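If you would rather script the check than read headers by eye, the same test can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive checker: the function names, the user agent string, and the choice to accept both text/plain and text/markdown are assumptions made for illustration.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError


def diagnose(requested_url, final_url, status, content_type):
    """Return a list of problems; an empty list means the basics look fine."""
    problems = []
    if not requested_url.startswith("https://"):
        problems.append("not requested over HTTPS")
    if status != 200:
        problems.append(f"status is {status}, expected 200")
    if final_url.rstrip("/") != requested_url.rstrip("/"):
        problems.append(f"redirected to {final_url}")
    media_type = content_type.split(";")[0].strip().lower()
    # Assumption: plain text or Markdown are both acceptable media types.
    if media_type not in ("text/plain", "text/markdown"):
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    return problems


def check_llms_txt(url, timeout=10):
    """Fetch the URL (redirects are followed) and diagnose the final response."""
    # Hypothetical user agent string, used only to identify this script.
    request = Request(url, headers={"User-Agent": "llms-txt-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return diagnose(url, response.geturl(), response.status,
                            response.headers.get("Content-Type", ""))
    except HTTPError as error:  # 4xx and 5xx responses are raised as exceptions
        return diagnose(url, error.url, error.code,
                        error.headers.get("Content-Type", ""))
    except OSError as error:  # DNS failure, TLS error, timeout, and similar
        return [f"request failed: {error}"]
```

Calling check_llms_txt("https://yourdomain.com/llms.txt") returns an empty list when the file loads cleanly, and a list of plain-language problems otherwise.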
Check that search engines and AI crawlers are not blocked
Next, inspect your robots.txt file.
Your llms.txt file is not the same as robots.txt. The two files do different jobs. robots.txt tells crawlers what they may or may not access. llms.txt tells AI systems which content is useful to understand.
Still, your crawler rules can accidentally undermine your llms.txt setup.
Check:
https://yourdomain.com/robots.txt
Look for rules that block the file itself, key documentation pages, blog sections, product pages, or Markdown versions of your content.
For example, this would be a problem:
User-agent: *
Disallow: /llms.txt
So would this, if your llms.txt links mostly point to URLs under /docs/:
User-agent: *
Disallow: /docs/
Also check whether your CDN, firewall, bot protection, or rate-limiting rules are blocking common AI crawlers. Some sites publish an llms.txt file but then prevent automated systems from fetching the URLs listed inside it.
That is like putting up a signpost and locking every door it points to.
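The robots.txt part of this check can be automated with Python's built-in urllib.robotparser. The sketch below takes the text of a robots.txt file and a list of URLs from your llms.txt file, and reports which ones a given user agent would be refused. The function name is an assumption, and the parser only covers robots.txt rules; it cannot tell you what your CDN or firewall is doing.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that robots.txt disallows for the given user agent."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network call is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


robots = "User-agent: *\nDisallow: /docs/\n"
links = [
    "https://example.com/llms.txt",
    "https://example.com/docs/start",
    "https://example.com/pricing",
]
print(blocked_urls(robots, links))
```

Run it once with "*" and once with each crawler token you care about, because a site can allow general crawlers while blocking a specific bot by name.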
Validate the structure of the file
A useful llms.txt file should be simple. Do not over-engineer it.
At minimum, it should include:
# Site or Brand Name
> A short, clear description of what the site does.
## Key Pages
- [About](https://example.com/about): Overview of the company, audience, and purpose.
- [Products](https://example.com/products): Main product range and use cases.
- [Pricing](https://example.com/pricing): Current pricing and plan information.
- [Support](https://example.com/support): Help centre and contact options.
The structure should be easy for both humans and machines to parse.
Check for these issues:
- Missing H1 title.
- No short summary under the title.
- Too many links.
- Vague section headings.
- Links without descriptions.
- Broken URLs.
- Redirecting URLs.
- URLs blocked by robots.txt.
- Links to thin, outdated, duplicate, or low-value pages.
- Promotional language instead of factual summaries.
- Inconsistent terminology.
- Long walls of text.
- JavaScript-dependent pages that do not render useful text.
A good test: read only your llms.txt file and ask whether a person could understand your site from it in 60 seconds. If not, an AI system may struggle too.
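Some of the structural issues above are mechanical enough to check with a short script. The following is a rough sketch in Python, assuming the layout shown earlier: an H1 on the first non-empty line, a blockquote summary just below it, and bulleted Markdown links with a description after a colon. The function name and the 50-link ceiling are arbitrary assumptions, and the script cannot judge vague headings or weak descriptions.

```python
import re

# A bulleted Markdown link, optionally followed by ": description".
LINK_LINE = re.compile(r"^\s*[-*]\s+\[([^\]]+)\]\(([^)\s]+)\)(.*)$")


def lint_llms_txt(text, max_links=50):
    """Return a list of structural issues found in an llms.txt file."""
    lines = [line.rstrip() for line in text.splitlines()]
    non_empty = [line for line in lines if line.strip()]
    issues = []
    if not non_empty or not non_empty[0].startswith("# "):
        issues.append("missing H1 title on the first non-empty line")
    if not any(line.startswith("> ") for line in non_empty[:3]):
        issues.append("no blockquote summary near the top")
    links = [match for match in map(LINK_LINE.match, lines) if match]
    if len(links) > max_links:  # the threshold is an arbitrary assumption
        issues.append(f"too many links: {len(links)}")
    for match in links:
        label, url, rest = match.groups()
        if not rest.strip().lstrip(":").strip():
            issues.append(f"link without description: {label}")
        if not url.startswith("https://"):
            issues.append(f"link is not an absolute HTTPS URL: {url}")
    return issues
```

An empty list only means the file passes these mechanical checks. The 60-second read-through is still the better test of whether it makes sense.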
Crawl every URL listed in the file
Your llms.txt file is only as good as the pages it points to.
Export all URLs from the file and crawl them with a tool such as Screaming Frog, Sitebulb, a custom script, or your CMS’s link checker.
For every linked page, check:
- status code;
- indexability;
- canonical tag;
- redirect status;
- page title;
- meta description;
- word count;
- internal links;
- last modified date;
- whether the page contains the information promised in the llms.txt description.
If your file says:
- [Returns Policy](https://example.com/returns): Full refund, exchange, and warranty rules.
Then that page should actually contain the full refund, exchange, and warranty rules. Do not make the AI infer details from half a paragraph.
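If you go the custom script route, the export step and a basic status check might look like the sketch below. It pulls every Markdown link out of the file, then records the status code and whether the request was redirected. The function names and user agent string are assumptions, and it covers only status and redirects; indexability, canonicals, and the content match still need a proper crawler or a manual read.

```python
import re
from urllib.request import Request, urlopen
from urllib.error import HTTPError

# [label](url) optionally followed by ": description" on the same line.
MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)(?::\s*(.*))?")


def extract_links(llms_txt):
    """Return (label, url, description) tuples for each Markdown link."""
    return [(m.group(1), m.group(2), (m.group(3) or "").strip())
            for m in MARKDOWN_LINK.finditer(llms_txt)]


def fetch_status(url, timeout=10):
    """Return (status_code, final_url); status_code is None if the request failed."""
    # Hypothetical user agent string, used only to identify this script.
    request = Request(url, headers={"User-Agent": "llms-txt-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status, response.geturl()
    except HTTPError as error:  # 4xx and 5xx responses
        return error.code, error.url
    except OSError:  # DNS failure, TLS error, timeout, and similar
        return None, url


def crawl_report(llms_txt):
    """Fetch every linked URL and return one row per link."""
    rows = []
    for label, url, description in extract_links(llms_txt):
        status, final_url = fetch_status(url)
        rows.append({"label": label, "url": url, "status": status,
                     "redirected": final_url != url, "description": description})
    return rows
```

Keeping the description in each row makes the last check easier: you can read the promise next to the page it points to.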
Test whether AI tools can use it
The most practical test is also the simplest: give the file to an AI tool and see what happens.
Try prompts like:
Read this llms.txt file and tell me what this website is about:
[paste file content]
Then ask:
Based only on this llms.txt file, what are the most important pages on this site?
And:
What questions could you confidently answer about this business from this file?
The answers should be specific. If the AI gives you generic output, your file is probably too thin or too vague.
Bad result:
This website provides services to customers and has helpful resources.
Better result:
This website sells accounting software for small businesses, with separate pages for pricing, payroll features, integrations, customer support, and migration from spreadsheets.
That level of specificity is what you are aiming for.
Test with real customer questions
Do not only test whether the file looks valid. Test whether it helps answer the questions people actually ask.
Create a short list of prompts based on your real search queries, sales calls, support tickets, or customer objections.
Examples:
What does [brand] do?
Does [brand] offer enterprise pricing?
How does [brand] compare with [competitor]?
What is [brand]’s refund policy?
Which [brand] product is best for beginners?
Run each prompt twice:
- without providing the llms.txt file;
- with the llms.txt file and its linked pages.
Compare the answers.
Look for:
- fewer hallucinations;
- more accurate summaries;
- better citations or source selection;
- clearer product descriptions;
- fewer outdated claims;
- stronger alignment with your preferred terminology;
- better handling of pricing, policies, and technical details.
If the answers do not improve, your file may be pointing to the wrong pages, or the underlying pages may need work.
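Comparing answers by eye gets tedious after a few prompts. One crude way to keep score is to write down the facts a correct answer should mention, then count how many appear in each version. The Python sketch below assumes you have already collected both answers by hand; the function names are made up for illustration, and plain substring matching will miss paraphrases, so treat the numbers as a prompt for closer reading rather than a verdict.

```python
def fact_coverage(answer, expected_facts):
    """Fraction of expected facts whose text appears in the answer."""
    if not expected_facts:
        return 0.0
    text = answer.lower()
    found = sum(1 for fact in expected_facts if fact.lower() in text)
    return found / len(expected_facts)


def compare_answers(without_file, with_file, expected_facts):
    """Score both answers against the same list of expected facts."""
    before = fact_coverage(without_file, expected_facts)
    after = fact_coverage(with_file, expected_facts)
    return {"without": before, "with": after, "improved": after > before}


facts = ["accounting software", "payroll", "pricing"]
without_file = "This website provides services to customers and has helpful resources."
with_file = ("This website sells accounting software for small businesses, "
             "with separate pages for pricing, payroll features, and integrations.")
print(compare_answers(without_file, with_file, facts))
```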
Check your server logs
Server logs are the closest thing to a reality check.
Look for requests to:
/llms.txt
/llms-full.txt
/.well-known/llms.txt
/.well-known/llms-full.txt
Then check which user agents are requesting them.
You may see visits from AI crawlers, SEO tools, uptime monitors, security scanners, or nothing at all. That is useful information either way.
Track:
- how often the file is requested;
- which bots request it;
- whether they also fetch the pages listed inside it;
- whether requests return 200, 304, 403, 404, or 5xx;
- whether bot protection is interfering;
- whether requests spike after you update the file.
A file that is never requested may still be useful when manually supplied to AI tools, but you should not assume it is being widely consumed by major AI systems.
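A few lines of Python are enough to pull these numbers out of an access log. The sketch below assumes your server writes the common "combined" log format, where each line ends with the quoted request, the status, the size, the referrer, and the user agent. If your format differs, the regular expression will need adjusting. The function name is an assumption.

```python
import re
from collections import Counter

# Matches the tail of a combined-format line:
# "METHOD path PROTOCOL" status size "referrer" "user agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

WATCHED_PATHS = {
    "/llms.txt",
    "/llms-full.txt",
    "/.well-known/llms.txt",
    "/.well-known/llms-full.txt",
}


def summarise(log_lines):
    """Count requests per (path, status, user agent) for the watched paths."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        path = match.group("path").split("?")[0]  # ignore query strings
        if path in WATCHED_PATHS:
            counts[(path, match.group("status"), match.group("agent"))] += 1
    return counts
```

Passing an open log file to summarise and printing counts.most_common() gives a quick view of who is asking for the file and what they are getting back.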
Compare llms.txt and llms-full.txt
Some sites use both:
- llms.txt as a concise index;
- llms-full.txt as a fuller Markdown version of key site content.
They serve different purposes.
Your llms.txt file should be short, selective, and navigational. Your llms-full.txt file can be longer and more comprehensive, especially for documentation, technical references, policies, product guides, or developer resources.
Test both files separately.
For llms.txt, ask:
- Is this the best possible map of the site?
- Are the most important pages listed first?
- Are the descriptions useful?
- Is anything missing?
- Is anything included that should not be?
For llms-full.txt, ask:
- Is the content current?
- Is it too long to be useful?
- Are headings clean?
- Are duplicate sections removed?
- Are policies, pricing, and technical details accurate?
- Does it include outdated copy from old pages?
More content is not always better. A bloated llms-full.txt file can make it harder for an AI system to find the right answer.
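Two of the llms-full.txt questions, length and duplicate sections, can be roughly approximated with a script. The Python sketch below counts words and flags headings that appear more than once, which often signals the same content pasted in twice. The function name is an assumption, and repeated headings are only a hint: two sections can legitimately share a title.

```python
import re
from collections import Counter

# A Markdown heading: one to six # characters, a space or tab, then the title.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+)$", re.MULTILINE)


def full_file_report(text):
    """Summarise the size of an llms-full.txt file and list repeated headings."""
    headings = [match.group(1).strip().lower() for match in HEADING.finditer(text)]
    repeated = sorted(title for title, count in Counter(headings).items() if count > 1)
    return {
        "words": len(text.split()),
        "headings": len(headings),
        "duplicate_headings": repeated,
    }
```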
Optimise the file for clarity, not hype
The best llms.txt files are boring in the right way. They are clear, factual, and well organised.
Avoid language like:
- [Solutions](https://example.com/solutions): Discover our world-class, cutting-edge, innovative solutions designed to transform your business.
Use language like:
- [Solutions](https://example.com/solutions): Overview of CRM, email marketing, and automation tools for small businesses.
AI systems do not need sales fluff. They need clean context.
Strong descriptions usually include:
- the topic of the page;
- the audience;
- the type of information available;
- any important constraints, such as region, product line, pricing model, or date sensitivity.
For example:
- [Business Insurance Guide](https://example.com/business-insurance): Explains public liability, professional indemnity, and workers compensation insurance for Australian small businesses.
That is much more useful than:
- [Business Insurance Guide](https://example.com/business-insurance): Learn everything you need to know.
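A simple word-list check can catch the most obvious sales language before a human review. The Python sketch below flags descriptions containing a handful of hype phrases. The phrase list is an assumption drawn from the weak examples in this section, so extend it with whatever your own marketing copy tends to overuse.

```python
# Assumption: a small starter list taken from the weak examples above.
HYPE_TERMS = (
    "world-class",
    "cutting-edge",
    "innovative",
    "transform your business",
    "everything you need to know",
)


def hype_in_description(description, terms=HYPE_TERMS):
    """Return the hype phrases found in a link description, in list order."""
    text = description.lower()
    return [term for term in terms if term in text]


weak = ("Discover our world-class, cutting-edge, innovative solutions "
        "designed to transform your business.")
strong = "Overview of CRM, email marketing, and automation tools for small businesses."
print(hype_in_description(weak))
print(hype_in_description(strong))
```

A clean result does not make a description good. It still needs to name the topic, the audience, and the type of information on the page.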
Prioritise your most authoritative pages
Your llms.txt file should not be a dumping ground for every URL you want crawled.
Prioritise pages that are:
- accurate;
- stable;
- comprehensive;
- internally approved;
- frequently cited;
- useful for answering common questions;
- representative of your products, services, expertise, or policies.
For most businesses, that means including:
- homepage;
- about page;
- main product or service pages;
- pricing page;
- documentation or help centre;
- contact page;
- policies;
- high-quality guides;
- comparison pages;
- original research;
- glossary or explainer content;
- API or developer documentation, where relevant.
Avoid linking to:
- tag archives;
- thin blog posts;
- outdated announcements;
- campaign landing pages;
- duplicate category pages;
- search result pages;
- pages with expired offers;
- pages that contradict newer content.
The goal is not to maximise volume. The goal is to reduce confusion.
Keep it updated
An outdated llms.txt file can become a liability.
Review it whenever you:
- launch a new product;
- change pricing;
- update policies;
- rebrand;
- migrate URLs;
- restructure documentation;
- remove a service;
- publish major research;
- change your positioning.
At minimum, review it quarterly. For fast-moving sites, review it monthly or automate the update process through your CMS or documentation platform.
The key is to avoid stale links and stale descriptions. If your pricing page changed six months ago but your llms.txt file still points to an old plan structure, you are feeding AI systems the wrong context.
Measure performance carefully
You can measure whether your llms.txt file is technically working. Measuring whether it improves AI visibility is harder.
Useful signals include:
- server log requests to /llms.txt;
- requests to linked Markdown pages;
- AI crawler activity before and after publishing;
- brand mentions in AI search tools;
- accuracy of AI-generated answers about your brand;
- referral traffic from AI platforms;
- support queries caused by incorrect AI answers;
- changes in how often your content is cited by AI tools.
But be careful. Correlation is not causation.
If AI referrals increase after publishing an llms.txt file, the file may have helped. Or the increase may have come from stronger content, better brand demand, broader crawl activity, PR, or unrelated changes in AI search systems.
Treat llms.txt as infrastructure, not magic.
A simple testing checklist
Use this checklist when auditing your file:
[ ] File exists at /llms.txt
[ ] File returns 200 OK
[ ] File is publicly accessible
[ ] File is served as plain text or Markdown
[ ] File has a clear H1 title
[ ] File has a short summary
[ ] Sections are clearly organised
[ ] Important pages are included
[ ] Unimportant pages are excluded
[ ] Every URL works
[ ] No linked URL is blocked
[ ] Descriptions are specific
[ ] Content is factual, not promotional
[ ] Pricing and policy links are current
[ ] Documentation links are current
[ ] Server logs show whether bots request the file
[ ] AI prompt tests produce better answers with the file than without it
[ ] Review schedule is in place
The bottom line
A working llms.txt file is not just one that loads in a browser. It should be accessible, well structured, crawlable, current, and genuinely useful when an AI system tries to understand your site.
The best test is not whether the file exists. It is whether it helps produce clearer, more accurate answers about your business.
If your file gives AI systems a clean map to your most authoritative content, it is doing its job. If it is just a list of random URLs wrapped in Markdown, it is probably not worth much yet.
Build it like a source guide. Test it like a technical asset. Maintain it like part of your content infrastructure.
