Content Inventory

• A Content Inventory catalogs all existing content on a site: every page, document, image, and media file with key metadata.
• It's the essential first step before any IA redesign, migration, or content strategy initiative.
• Inventories reveal the true scope of content: teams are consistently shocked by how much content they actually have.

stellae.design

A Content Inventory is a comprehensive catalog of all content assets in a digital product. It is typically kept in a spreadsheet with columns for URL, page title, content type, owner, last-updated date, traffic data, and quality assessment. Content inventories are the foundation for content audits (qualitative assessments of content), migration planning, IA restructuring, and governance initiatives. Kristina Halvorson ('Content Strategy for the Web') and Paula Ladenburg Land ('Content Audits and Inventories: A Handbook') helped establish content inventories as standard practice. Inventories are labor-intensive, but they provide the factual basis that keeps IA decisions from resting on assumptions about what content exists.
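The spreadsheet layout described above can be sketched as a small record type plus a CSV export. This is a minimal illustration, not a standard schema: the field names, the three-value quality scale, and the example.com URLs are all assumptions made for the example.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class InventoryRow:
    """One row of the inventory spreadsheet (field names are hypothetical)."""
    url: str
    title: str
    content_type: str
    owner: str
    last_updated: date
    monthly_pageviews: int
    quality: str  # assumed scale: "good", "needs review", "retire"


def to_csv(rows):
    """Serialize inventory rows to CSV text, one line per content asset."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(InventoryRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        record = asdict(row)
        # Store dates as ISO 8601 strings so spreadsheets sort them correctly.
        record["last_updated"] = row.last_updated.isoformat()
        writer.writerow(record)
    return buffer.getvalue()


# Made-up sample data for illustration only.
rows = [
    InventoryRow("https://example.com/about", "About us", "page",
                 "Marketing", date(2021, 3, 14), 1200, "needs review"),
    InventoryRow("https://example.com/pricing.pdf", "Pricing sheet", "document",
                 "Sales", date(2024, 1, 9), 310, "good"),
]
csv_text = to_csv(rows)
print(csv_text)
```

Keeping one typed record per asset makes it harder for columns to drift apart as more contributors add rows, and the CSV output opens directly in any spreadsheet tool.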

Why It Matters

A content inventory is a systematic catalog of every piece of content in a digital product or website. It documents what exists, where it lives, who owns it, when it was last updated, and what condition it is in. Every content strategy decision depends on that factual foundation, because you cannot improve, reorganize, or govern content you have not first identified and assessed. Without an inventory, teams decide on assumptions and incomplete knowledge: they redesign a site without knowing that 40% of its pages have not been updated in three years, migrate to a new CMS without knowing the full range of content types and metadata structures, or launch a new information architecture without knowing how much content must be remapped. The inventory turns content management from a subjective, opinion-driven activity into an evidence-based discipline, letting teams identify redundancies, gaps, outdated material, and governance failures with data rather than guesswork.
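A figure such as "40% of pages have not been updated in three years" is a simple calculation once last-updated dates are in the inventory. The sketch below assumes a hypothetical `stale_share` helper and made-up dates; the three-year threshold is approximated as 3 × 365 days.

```python
from datetime import date


def stale_share(last_updated_dates, as_of, max_age_days=3 * 365):
    """Return the fraction of items last updated more than max_age_days before as_of."""
    if not last_updated_dates:
        return 0.0
    stale = sum(1 for d in last_updated_dates if (as_of - d).days > max_age_days)
    return stale / len(last_updated_dates)


# Made-up last-updated dates for five pages.
dates = [
    date(2019, 5, 1),
    date(2020, 11, 20),
    date(2023, 2, 3),
    date(2024, 6, 30),
    date(2024, 9, 12),
]
share = stale_share(dates, as_of=date(2025, 1, 1))
print(f"{share:.0%} of pages are older than three years")  # prints "40% ..."
```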

Real-World Examples

BBC's content audit during responsive redesign

When the BBC undertook its massive responsive web redesign, the team first conducted a comprehensive content inventory of its sprawling digital estate. Before making any design or architecture decisions, they cataloged thousands of pages, identified content types, mapped ownership to editorial teams, and assessed the condition and relevance of every piece of content. The inventory revealed that a significant percentage of the content had not been updated in years, that multiple editorial teams had created overlapping content on the same topics, and that the actual content structure differed dramatically from the structure the existing IA assumed. The inventory data drove decisions about what to migrate, what to consolidate, and what to retire, saving the project from the common failure mode of redesigning the container without understanding the contents.

Content inventory driving GatherContent's migration tool

GatherContent built its entire content operations platform around the principle that content inventories should be living documents rather than one-time audits. Its tool automatically inventories content across sources, tracks status through editorial workflows, and maintains a real-time view of content health metrics across the organization. This approach addresses the central problem with traditional content inventories: they are enormously time-consuming to create and begin decaying immediately, so teams conduct them once and never again. By making the inventory a continuous, automated process integrated into the content workflow, GatherContent enabled organizations to base ongoing content strategy decisions on current data rather than on a snapshot that was accurate six months ago.

Counter-example

University website redesign without content inventory

A large university began a website redesign by creating a new visual theme and information architecture based on stakeholder interviews and competitor analysis. It did not first inventory the existing site's content, which comprised over fifteen thousand pages created by hundreds of department contributors over fifteen years. During migration, the team discovered content types the new templates could not accommodate, regulatory content that could not be restructured without legal review, and thousands of pages with external inbound links that would break if URLs changed. An inventory would have identified all of these problems months earlier. The project went 200% over budget and timeline because the team essentially had to conduct the content inventory during migration, the most expensive and disruptive possible moment to discover the actual scope and complexity of their content.
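Two of the surprises in this example, unsupported content types and pages with inbound links, are the kind of thing a basic check over inventory data can surface before migration begins. The following is a rough sketch under stated assumptions: the `migration_risks` helper, the `content_type` and `inbound_links` fields, and the sample rows are all hypothetical.

```python
def migration_risks(inventory, supported_types):
    """Summarize two risks an inventory can surface before migration starts."""
    supported = set(supported_types)
    # Content types present on the old site that no new template handles.
    unsupported = sorted({row["content_type"] for row in inventory} - supported)
    # Pages other sites link to; changing their URLs needs a redirect plan.
    needs_redirect = [row["url"] for row in inventory if row["inbound_links"] > 0]
    return {
        "unsupported_types": unsupported,
        "urls_needing_redirects": needs_redirect,
    }


# Made-up inventory rows for illustration.
inventory = [
    {"url": "/admissions", "content_type": "page", "inbound_links": 42},
    {"url": "/catalog/2019.pdf", "content_type": "pdf", "inbound_links": 0},
    {"url": "/policies/privacy", "content_type": "policy", "inbound_links": 7},
]
risks = migration_risks(inventory, supported_types=["page", "pdf"])
print(risks)
```

Content that needs legal review cannot be detected this way; it would have to be flagged by the content owner in a dedicated inventory column.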

Common Mistakes

• The most expensive mistake is skipping the content inventory entirely and proceeding with content strategy, migration, or redesign based on assumptions about what content exists. Teams consistently underestimate both the volume and the complexity of their content, and discovering the true scope mid-project causes budget overruns, timeline slips, and architectural compromises that a pre-project inventory would have prevented.
• Another frequent error is treating the inventory as a one-time exercise that produces a static spreadsheet, which begins decaying immediately as content is created and modified. Sustainable content governance requires an automated, continuously updated inventory integrated into the content management workflow.
• Teams also commonly inventory content properties (URL, title, date) without assessing content quality (accuracy, relevance, readability, accessibility). The result is a catalog that tells you where everything is but not whether any of it is good, and quality assessment is where the inventory generates its most actionable insights.
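To illustrate the last point, quality ratings can sit next to the property columns and drive a recommended action for each item. This is only a sketch: the 1 to 5 reviewer scale, the thresholds, and the `recommend_action` helper are assumptions, and a real audit would tune them to its own criteria.

```python
# The four quality dimensions named in the text above.
QUALITY_CRITERIA = ("accuracy", "relevance", "readability", "accessibility")


def recommend_action(scores, revise_below=4, retire_below=2):
    """Map reviewer scores (assumed 1-5 scale) to keep / revise / retire."""
    missing = [c for c in QUALITY_CRITERIA if c not in scores]
    if missing:
        # Properties alone (URL, title, date) say nothing about quality.
        return "unassessed"
    lowest = min(scores[c] for c in QUALITY_CRITERIA)
    if lowest < retire_below:
        return "retire"
    if lowest < revise_below:
        return "revise"
    return "keep"


# Made-up ratings for one page.
page_scores = {"accuracy": 5, "relevance": 4, "readability": 3, "accessibility": 4}
action = recommend_action(page_scores)
print(action)
```

Using the lowest score rather than an average is a deliberately strict choice here: a page that is accurate but inaccessible still gets flagged for work.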