AI-powered sustainability report generation using template-driven content hydration, Workiva data retrieval, and conversational editing via Microsoft 365 Copilot.
The Retrieval PoC enables analysts to generate complete sustainability report sections through a conversational interface. Using a declarative template language, the system dynamically populates content by combining LLM-generated narratives with live data from Workiva factbooks.
- LaTeX templates with embedded tags for dynamic content injection
- Direct cell references hydrate templates with factbook data
- Natural language commands via Microsoft 365 Copilot chat
The template language uses a tag-based syntax embedded in LaTeX documents. Tags are resolved at generation time, with content either retrieved from Workiva or generated by the LLM based on specification files.
| Tag Pattern | Type | Description |
|---|---|---|
| `{@content:section-id}` | LLM Generated | Generates content using the spec file at /specs/section-id.yaml |
| `{@content:section-id.subsection}` | LLM Generated | Nested content generation with hierarchical spec lookup |
| `{$workiva:sheet_name:cell_ref}` | Data Reference | Retrieves live value from Workiva spreadsheet cell |
| `{$workiva:sheet_name:range}` | Data Reference | Retrieves a range of cells (e.g., A1:D10) |
| `{#if condition}...{/if}` | Conditional | Conditionally include content based on Workiva values |
| `{#each range as item}...{/each}` | Iterator | Iterate over Workiva rows to generate repeated sections |

    % VSME Environmental Section - Generated Template
    \section{Environmental Disclosures}

    % Pull company name from Workiva factbook
    \subsection{About {$workiva:Factbook:B2}}

    % LLM-generated introduction based on spec file
    {@content:env-intro}

    % Climate transition plan status from factbook
    \subsection{Climate Transition Status}
    {#if {$workiva:C3:G4} == "TRUE"}
    The undertaking operates in high-impact sectors and has adopted
    the following transition measures:
    {@content:transition-plan.adopted}
    {/if}
    {#if {$workiva:C3:G4} == "FALSE"}
    {@content:transition-plan.not-applicable}
    {/if}

    % KPI display with live data
    \begin{kpicard}
      \kpivalue{{$workiva:Metrics:C12}}
      \kpilabel{GHG Emissions Reduction}
      \kpichange{{$workiva:Metrics:D12} YoY}
    \end{kpicard}

    % Pollutant emissions table - iterate over data rows
    \subsection{Pollutant Emissions}
    {#each {$workiva:B4:B5:B23} as row}
    \emissionrow{{row.pollutant}}{{row.air}}{{row.water}}{{row.soil}}
    {/each}

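As a rough illustration of how the tags in a template like the one above might be located, the sketch below scans a template string with one regular expression per tag family. The `Tag` tuple, the `scan_tags` function, and the kind names are hypothetical and not part of the existing prototype; the sketch also assumes that `{#if}` and `{#each}` openers contain at most one level of nested braces.

```python
import re
from typing import NamedTuple


class Tag(NamedTuple):
    kind: str   # "content", "data", "cond", "loop", "endif", or "endeach"
    body: str   # text inside the tag, without the sigil and outer braces
    start: int  # character offset of the tag in the template


# One alternative per tag family. The {#if} and {#each} openers may wrap a
# {$workiva:...} reference, so their bodies allow one level of inner braces.
TAG_RE = re.compile(
    r"\{@content:(?P<content>[\w.-]+)\}"
    r"|\{\$workiva:(?P<data>[^{}]+)\}"
    r"|\{#if (?P<cond>(?:[^{}]|\{[^{}]*\})+)\}"
    r"|\{#each (?P<loop>(?:[^{}]|\{[^{}]*\})+)\}"
    r"|\{/(?P<close>if|each)\}"
)


def scan_tags(template: str) -> list[Tag]:
    """Return every template tag in document order."""
    tags = []
    for match in TAG_RE.finditer(template):
        kind = match.lastgroup
        body = match.group(kind)
        if kind == "close":
            # "{/if}" becomes kind "endif", "{/each}" becomes "endeach".
            kind, body = "end" + body, ""
        tags.append(Tag(kind, body, match.start()))
    return tags
```

A data reference nested inside an `{#if}` or `{#each}` opener is reported as part of that opener's body rather than as a separate tag, so a real parser would need a second pass over those bodies.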
Each {@content:id} tag is backed by a YAML specification file that defines the generation
parameters. Specs include style guidelines, VSME framework references, tone, and context.

    id: env-intro
    type: narrative
    vsme_reference: B1-B7
    paragraph_guidance: "29-56"

    style:
      tone: professional, formal
      voice: third-person
      length: 150-250 words
      brand_guidelines: ./brand/kpmg-style.yaml

    # Tags are quoted so YAML reads them as strings, not flow mappings
    context:
      company_name: "{$workiva:Factbook:B2}"
      reporting_year: "{$workiva:Factbook:B5}"
      sector: "{$workiva:Factbook:B8}"

    prompt: |
      Write an introductory paragraph for the Environmental section
      of a VSME sustainability report. The paragraph should:
      - Introduce the company's environmental commitment
      - Reference the reporting period and framework (EU VSME)
      - Set the context for detailed disclosures that follow
      - Maintain consistency with prior year reports if available

    cache:
      enabled: true
      invalidate_on:
        - workiva:Factbook:B2  # Company name change
        - workiva:Factbook:B5  # Reporting year change

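The `cache.invalidate_on` list ties cached content to specific factbook cells. One possible way to honor it, shown here only as a sketch, is to derive the cache key from the spec id plus the current values of the watched cells, so that a change in any watched cell produces a new key. The `cache_key` function and the `fetch_cell` callable are assumptions for illustration; the actual layout of `content_cache.json` is not specified in this document.

```python
import hashlib
import json
from typing import Callable


def cache_key(spec_id: str, invalidate_on: list[str],
              fetch_cell: Callable[[str], str]) -> str:
    """Hash the spec id together with the current values of watched cells.

    `fetch_cell` is a hypothetical callable that maps a reference such as
    "workiva:Factbook:B2" to its current value.
    """
    watched = {ref: fetch_cell(ref) for ref in sorted(invalidate_on)}
    payload = json.dumps({"id": spec_id, "watched": watched}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With this scheme no explicit invalidation step is needed: a stale entry is simply never looked up again once a watched value changes.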

    specs/
    ├── env-intro.yaml            # Environmental introduction
    ├── transition-plan/
    │   ├── adopted.yaml          # For companies with plans
    │   └── not-applicable.yaml   # For non-applicable cases
    ├── emissions/
    │   ├── scope1.yaml
    │   ├── scope2.yaml
    │   └── scope3.yaml
    ├── pollutants/
    │   └── summary.yaml
    └── brand/
        └── kpmg-style.yaml       # Shared style guidelines
    templates/
    ├── environmental-section.tex
    ├── social-section.tex
    └── governance-section.tex
    cache/
    └── content_cache.json        # Generated content cache

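The directory layout suggests that a dotted content id maps onto nested folders, so `transition-plan.adopted` would live at `specs/transition-plan/adopted.yaml`. A minimal sketch of that lookup, assuming each dot corresponds to exactly one directory level (the `spec_path` helper is hypothetical):

```python
from pathlib import PurePosixPath


def spec_path(content_id: str, root: str = "specs") -> PurePosixPath:
    """Map a dotted content id to the path of its YAML spec file."""
    parts = content_id.split(".")
    if not all(parts):
        raise ValueError(f"malformed content id: {content_id!r}")
    # Every segment but the last is a directory; the last names the file.
    return PurePosixPath(root, *parts[:-1], parts[-1] + ".yaml")
```

For example, `spec_path("env-intro")` yields `specs/env-intro.yaml`.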
The system reads directly from Workiva factbooks using the existing API integration. Cell references follow a simple addressing scheme that maps to the VSME framework structure.
| Reference | Returns | Example |
|---|---|---|
| `{$workiva:C3:G4}` | Single cell value | "TRUE" |
| `{$workiva:C3:G4:G10}` | Range (array) | ["TRUE", "Not adopted...", ...] |
| `{$workiva:B4:B5:B23}` | Row iterator | Pollutant data rows for {#each} |
| `{$workiva:Factbook:company_name}` | Named field | Uses mapper.yaml field definitions |
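To make the addressing scheme concrete, the sketch below splits a reference tag into its sheet and target. It assumes that a two-part reference whose target looks like a spreadsheet cell (letters followed by digits) is a cell, that any other two-part target is a named field, and that a three-part reference is a range. The `WorkivaRef` type and `parse_ref` function are hypothetical names, and the cell-versus-field heuristic is a guess rather than documented behavior.

```python
import re
from dataclasses import dataclass
from typing import Optional

CELL_RE = re.compile(r"^[A-Z]+[0-9]+$")
PREFIX = "{$workiva:"


@dataclass(frozen=True)
class WorkivaRef:
    sheet: str
    kind: str                  # "cell", "range", or "field"
    start: str                 # cell address or field name
    end: Optional[str] = None  # set only for ranges


def parse_ref(tag: str) -> WorkivaRef:
    """Parse a tag such as "{$workiva:C3:G4:G10}" into its components."""
    if not (tag.startswith(PREFIX) and tag.endswith("}")):
        raise ValueError(f"not a Workiva reference: {tag!r}")
    parts = tag[len(PREFIX):-1].split(":")
    if len(parts) == 2:
        sheet, target = parts
        kind = "cell" if CELL_RE.match(target) else "field"
        return WorkivaRef(sheet, kind, target)
    if len(parts) == 3 and CELL_RE.match(parts[1]) and CELL_RE.match(parts[2]):
        return WorkivaRef(parts[0], "range", parts[1], parts[2])
    raise ValueError(f"unrecognized reference shape: {tag!r}")
```

Under these assumptions, a range and a row iterator parse identically; whether the result is consumed as an array or iterated row by row would depend on the surrounding tag.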
1. Analyst requests a section via Copilot
2. System loads the LaTeX template
3. Parser identifies all tags
4. System retrieves Workiva cell values
5. LLM generates narrative
6. System builds PDF output
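The middle steps of this flow (tag resolution, data retrieval, narrative generation) can be sketched as a single hydration function. This is a simplified illustration, not the prototype's implementation: `fetch_cell` and `generate` are hypothetical callables standing in for the Workiva integration and the LLM call, only the `{#if {$workiva:...} == "value"}` form of conditional is handled, nested conditionals are not supported, and `{#each}` expansion and PDF compilation are left out.

```python
import re
from typing import Callable

IF_RE = re.compile(
    r'\{#if \{\$workiva:([^{}]+)\} == "([^"]*)"\}(.*?)\{/if\}', re.DOTALL
)
DATA_RE = re.compile(r"\{\$workiva:([^{}]+)\}")
CONTENT_RE = re.compile(r"\{@content:([\w.-]+)\}")


def hydrate(template: str,
            fetch_cell: Callable[[str], str],
            generate: Callable[[str], str]) -> str:
    """Resolve conditionals, then data references, then content tags."""

    def keep_or_drop(match: re.Match) -> str:
        ref, expected, inner = match.groups()
        return inner if fetch_cell(ref) == expected else ""

    # Conditionals first, so content inside dropped blocks is never generated.
    text = IF_RE.sub(keep_or_drop, template)
    text = DATA_RE.sub(lambda m: fetch_cell(m.group(1)), text)
    text = CONTENT_RE.sub(lambda m: generate(m.group(1)), text)
    return text
```

Resolving conditionals before content tags avoids paying for LLM generation on branches that will not appear in the output.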
The primary interface for document generation is the Microsoft 365 Copilot chat. Analysts use natural language commands to generate, edit, and refine report sections.

    Analyst: Generate the environmental section for the 2024 report

    Copilot: I'll generate the Environmental section using the current
    Workiva factbook data. Let me fetch the latest values...

    Reading from Workiva:
    - Company: Acme Industries Ltd
    - Reporting Period: Jan 1 - Dec 31, 2024
    - High Impact Sector: Yes
    - Transition Plan Status: Adopted with targets

    Generating content...

    Here's your draft Environmental section. Key highlights:
    - Introduction paragraph (150 words)
    - Climate transition status with targets
    - 3 KPI cards with live metrics
    - Pollutant emissions table (12 rows)

    [Preview PDF] [Edit in Word] [Approve & Save]

    ---

    Analyst: The tone in the introduction is too informal. Make it more formal.

    Copilot: I'll regenerate the introduction with a more formal tone
    while keeping the same factual content.

    Updated introduction:
    "Acme Industries Ltd hereby presents its environmental disclosures
    in accordance with the EU Voluntary SME Sustainability Reporting
    Standard (VSME) for the fiscal year ended December 31, 2024..."

    [Accept Changes] [Try Another Version]

    ---

    Analyst: Add a section about our water usage from the B4 data

    Copilot: I'll add a water usage subsection. Let me check what data
    is available in the B4 (Pollutants) section of your factbook...

    Found water-related data:
    - B4-6 through B4-12: Water pollutant emissions
    - Metrics available for Nitrogen, Phosphorus, BOD

    [Generate Water Section] [Show Available Data]

| Command Pattern | Action |
|---|---|
| "Generate [section] for [year]" | Creates full section from template with current Workiva data |
| "Show me the data for [framework code]" | Displays relevant Workiva factbook values |
| "Regenerate [content block]" | Forces fresh LLM generation, ignoring cache |
| "Make it more [tone]" | Adjusts style/tone of most recent generated content |
| "Add [topic] from [data source]" | Incorporates additional Workiva data into section |
| "Preview as PDF" | Compiles current draft to PDF for review |
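In practice Copilot handles the language understanding, so the patterns above are guidance rather than a grammar. Purely to illustrate how each pattern maps to an action with extracted parameters, here is a toy router built on regular expressions; the action names and the `route` function are hypothetical.

```python
import re
from typing import Optional

# (action name, pattern) pairs mirroring the command table, tried in order.
COMMANDS = [
    ("generate_section",
     re.compile(r"^generate (?:the )?(?P<section>.+?) for (?:the )?(?P<year>\d{4})", re.I)),
    ("show_data", re.compile(r"^show me the data for (?P<code>.+)$", re.I)),
    ("regenerate", re.compile(r"^regenerate (?:the )?(?P<block>.+)$", re.I)),
    ("adjust_tone", re.compile(r"^make it more (?P<tone>.+)$", re.I)),
    ("add_topic", re.compile(r"^add (?P<topic>.+?) from (?P<source>.+)$", re.I)),
    ("preview_pdf", re.compile(r"^preview as pdf$", re.I)),
]


def route(utterance: str) -> Optional[tuple[str, dict[str, str]]]:
    """Return (action, parameters) for the first matching pattern, else None."""
    text = utterance.strip().rstrip(".")
    for action, pattern in COMMANDS:
        match = pattern.match(text)
        if match:
            return action, match.groupdict()
    return None
```

A router like this would miss compound requests such as the second analyst message in the conversation above, which is one reason to leave interpretation to the conversational layer.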
The PoC builds on the existing document generation prototype at
experiments/test_doc_gen.py. Key components to extend:
API endpoints (app/api/routes/demo.py):

    POST /api/generate/section
      Body:    { template: "environmental", year: 2024 }
      Returns: { pdf_url, word_url, content_hash }

    POST /api/generate/regenerate
      Body:    { content_id: "env-intro", force: true }
      Returns: { new_content, previous_version }

    GET /api/workiva/cells/{sheet}/{range}
      Returns: { cells: [...], last_updated }

    POST /api/preview/pdf
      Body:    { template, data_overrides }
      Returns: { pdf_url, warnings }

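A client-side sketch of the request shapes listed above, using only the standard library. The dataclass names and helper functions are hypothetical; the field names follow the endpoint listing, and percent-encoding the range (so that `B5:B23` becomes `B5%3AB23`) is an assumption about how the route expects path segments to be escaped.

```python
import json
from dataclasses import asdict, dataclass
from urllib.parse import quote


@dataclass
class GenerateSectionRequest:
    """Body for POST /api/generate/section."""
    template: str
    year: int


@dataclass
class RegenerateRequest:
    """Body for POST /api/generate/regenerate."""
    content_id: str
    force: bool = True


def to_json(request) -> str:
    """Serialize a request dataclass to a JSON body."""
    return json.dumps(asdict(request), sort_keys=True)


def cells_path(sheet: str, cell_range: str) -> str:
    """Build the path for GET /api/workiva/cells/{sheet}/{range}."""
    return f"/api/workiva/cells/{quote(sheet, safe='')}/{quote(cell_range, safe='')}"
```

For instance, `to_json(GenerateSectionRequest("environmental", 2024))` produces the body shown for the first endpoint.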
This specification outlines Phase 4 capabilities. Implementation follows the existing architecture patterns.