PDF to Visual Struct

Overview

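PDF to Visual Struct converts a single PDF page into the Codia Visual Element Schema: a strongly typed, hierarchical JSON element tree with bounding boxes, layout configuration, and style specifications. The schema is the same one Codia Studio uses, so downstream consumers (Figma importers, code generators, visual QA pipelines) can work against one stable data structure.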

Endpoint

POST https://openapi.codia.ai/v2/open/pdf_to_design

Authenticate with a Bearer token. Get a key at codia.ai/dashboard/developer.

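Request

Requests are sent as multipart/form-data.

Field	Type	Required	Description
`pdf_file`	file	Yes	The PDF file to convert. One file per request.
`page_no`	string	Yes	Zero-based page number. To convert multiple pages, loop on the client and send the requests concurrently.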

Example

bash

curl 'https://openapi.codia.ai/v2/open/pdf_to_design' \
  -H 'Authorization: Bearer {codia_api_key}' \
  --form 'pdf_file=@"xx.pdf"' \
  --form 'page_no="0"'
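
The curl call above can also be expressed as a `fetch` request. The following is a minimal sketch, assuming Node 18+ or a browser where `FormData`, `Blob`, and `fetch` are globals. `makePageRequest` and the fallback file name `document.pdf` are hypothetical names chosen for illustration; only the URL, the two form fields, and the Bearer header come from this page.

```javascript
const ENDPOINT = 'https://openapi.codia.ai/v2/open/pdf_to_design'

// makePageRequest is a hypothetical helper, not part of the Codia API.
// `file` is a Blob or File holding the PDF; `pageNo` is the zero-based page index.
function makePageRequest(file, pageNo, apiKey) {
  const body = new FormData()
  body.append('pdf_file', file, file.name ?? 'document.pdf')
  body.append('page_no', String(pageNo)) // page_no is documented as a string
  return {
    method: 'POST',
    // Content-Type is left unset so fetch can add multipart/form-data
    // together with the boundary it generates.
    headers: { Authorization: `Bearer ${apiKey}` },
    body,
  }
}
```

A call would then look like `fetch(ENDPOINT, makePageRequest(file, 0, apiKey))`, followed by `res.json()` to read the response.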

Response

json

{
  "configuration": {
    "scalingFactor": 1,
    "baseWidth": 1940,
    "measurementUnit": "px"
  },
  "pages": [
    {
      "visualElement": { /* root element */ }
    }
  ],
  "size": {
    "height": 1080,
    "width": 1940
  }
}

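Top-level fields

Field	Description
`configuration.baseWidth`	Reference width used when solving the layout. To render at a different viewport width, scale bounding boxes proportionally.
`configuration.measurementUnit`	Coordinate unit, usually `px`.
`configuration.scalingFactor`	Pre-applied scaling ratio. Usually `1`.
`size.width`, `size.height`	Page canvas size, solved in the base coordinate system.
`pages[].visualElement`	Root node of the page tree. Recurse through `childElements`.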
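
Proportional scaling against `configuration.baseWidth` can be sketched as below. This assumes a bounding box is an `[x, y, width, height]` array in base coordinates and that `scalingFactor` is `1`, so it is ignored here. `scaleBoundingBox` is a hypothetical helper name.

```javascript
// Scale an [x, y, width, height] box from the base coordinate system
// to a viewport that is targetWidth wide, keeping the aspect ratio.
function scaleBoundingBox(box, baseWidth, targetWidth) {
  const ratio = targetWidth / baseWidth
  return box.map((value) => value * ratio)
}
```

For example, with `baseWidth` 1940 and a 970-wide viewport, every coordinate is halved.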

Visual Element Schema

Every node has the same structure:

json

{
  "elementId": "pdf_page_1",
  "elementName": "PDF Document Page",
  "elementType": "Panel",
  "displayName": "First Page",
  "displayOrder": 0,
  "boundingBox": [0, 0, 595, 842],
  "layoutConfig": {
    "positionMode": "Normal",
    "flexibleMode": "Absolute"
  },
  "styleConfig": {
    "widthSpec":  { "sizing": "FIXED", "value": 595 },
    "heightSpec": { "sizing": "FIXED", "value": 842 },
    "backgroundSpec": {
      "type": "COLOR",
      "backgroundColor": { "rgbValues": [255, 255, 255] }
    }
  },
  "processingMeta": {
    "surfaceArea": 501490,
    "detectionScore": 0.92,
    "textContainerized": false
  },
  "childElements": []
}

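Field reference

Field	Description
`elementId`	Identifier that is stable within a single response; not guaranteed to be globally unique across pages.
`elementType`	Strongly typed element class: `Panel`, `Text`, `Image`, `Icon`, `Button`, `Table`, `Chart`, and 50+ other types.
`boundingBox`	`[x, y, width, height]` in base coordinates.
`layoutConfig.positionMode`	`Normal` (flow) or `Absolute`.
`layoutConfig.flexibleMode`	`Row`, `Column`, or `Absolute`; aligns with Figma auto-layout.
`styleConfig.widthSpec.sizing`	`FIXED`, `FILL_CONTAINER`, or `HUG_CONTENT`.
`styleConfig.backgroundSpec`	Background: color, gradient, or image.
`processingMeta.detectionScore`	Detection confidence, from 0.0 to 1.0. Filter out low-confidence nodes before downstream use.
`childElements`	Array of child elements. Traverse it depth-first to obtain the full hierarchy.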
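
A depth-first traversal over `childElements` might look like the following sketch. `walk` and `collectByType` are hypothetical helper names; the only schema fields used are `childElements` and `elementType`.

```javascript
// Yield every node in the tree in depth-first, pre-order sequence,
// together with its depth below the root.
function* walk(node, depth = 0) {
  yield { node, depth }
  for (const child of node.childElements ?? []) {
    yield* walk(child, depth + 1)
  }
}

// Collect all nodes of one elementType, e.g. every "Text" node on a page.
function collectByType(root, elementType) {
  const matches = []
  for (const { node } of walk(root)) {
    if (node.elementType === elementType) matches.push(node)
  }
  return matches
}
```

Calling `collectByType(response.pages[0].visualElement, 'Text')` would return the text nodes of the first page in document order.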

Limits

Item	Default
Maximum file size	50 MB
Maximum pages per document	100
Requests per minute	Depends on plan; see pricing.
Typical latency per page	600 ms to 2 s
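
A client can check the two fixed limits before uploading. This is a minimal sketch that assumes "50 MB" means 50 × 1024 × 1024 bytes and that the caller already knows the document's page count. `checkLimits` and the constant names are hypothetical.

```javascript
const MAX_FILE_BYTES = 50 * 1024 * 1024 // assumption: 50 MB read as 50 MiB
const MAX_PAGES = 100

// Return a list of human-readable problems; an empty list means the
// document is within the documented default limits.
function checkLimits(fileSizeBytes, pageCount) {
  const problems = []
  if (fileSizeBytes > MAX_FILE_BYTES) problems.push('file exceeds 50 MB')
  if (pageCount > MAX_PAGES) problems.push('document exceeds 100 pages')
  return problems
}
```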

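Supported inputs

Text-based PDFs (produced by design tools, office software, or programmatically): highest fidelity.
Scanned PDFs: OCR is applied automatically; low-resolution scans produce noisier output.
Password-protected PDFs are not supported; unlock them first.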

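Common patterns

Multi-page conversion

Within your plan's concurrency limit, loop over page numbers 0 through the page count minus 1 and send the requests concurrently: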

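// pageNumbers, ENDPOINT, file, and buildRequest are caller-defined placeholders.
const pages = await Promise.all(
  pageNumbers.map((n) =>
    fetch(ENDPOINT, buildRequest(file, n)).then((res) => res.json())),
)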
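
`Promise.all` over every page starts all requests at once, which can exceed a plan's concurrency limit on long documents. One way to cap the number of in-flight requests is a small worker pool, sketched below. `mapWithLimit` is a hypothetical helper, not part of any Codia SDK.

```javascript
// Run worker(item, index) over items with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length)
  let next = 0
  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function run() {
    while (next < items.length) {
      const i = next++
      results[i] = await worker(items[i], i)
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run())
  await Promise.all(runners)
  return results
}
```

With the placeholders from the snippet above, usage could be `mapWithLimit(pageNumbers, 4, (n) => fetch(ENDPOINT, buildRequest(file, n)).then((res) => res.json()))`, where `4` is an arbitrary example value rather than a documented limit.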

Filtering low-confidence nodes

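// Recursively drop children whose detectionScore is below minScore.
// Nodes without a score are kept (an assumption; adjust as needed). Mutates the tree.
function prune(node, minScore = 0.6) {
  node.childElements = (node.childElements || [])
    .filter((c) => (c.processingMeta?.detectionScore ?? 1) >= minScore)
    .map((c) => prune(c, minScore))
  return node
}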

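Writing back to Figma

Because layoutConfig aligns with Figma auto-layout, the tree can be imported directly as auto-layout frames. If you would rather not build your own importer, use the PDF to Design Figma plugin.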
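
For a custom importer, the correspondence could be sketched as a plain mapping function. The target property names (`layoutMode`, `layoutSizingHorizontal`, `layoutSizingVertical`) and their values follow the Figma Plugin API, but the mapping itself is an assumption and is not documented here. `toFigmaFrameProps` is a hypothetical helper. This page also does not state whether `boundingBox` is page-relative or parent-relative, so `x` and `y` may need adjusting for nested frames.

```javascript
// Assumed mapping from schema values to Figma Plugin API values.
const LAYOUT_MODE = { Row: 'HORIZONTAL', Column: 'VERTICAL', Absolute: 'NONE' }
const SIZING = { FIXED: 'FIXED', FILL_CONTAINER: 'FILL', HUG_CONTENT: 'HUG' }

// Build a plain object of frame properties from one schema node.
// Unknown or missing values fall back to a fixed-size, non-auto-layout frame.
function toFigmaFrameProps(node) {
  const [x, y, width, height] = node.boundingBox
  const style = node.styleConfig ?? {}
  return {
    name: node.displayName ?? node.elementName ?? node.elementId,
    x,
    y,
    width,
    height,
    layoutMode: LAYOUT_MODE[node.layoutConfig?.flexibleMode] ?? 'NONE',
    layoutSizingHorizontal: SIZING[style.widthSpec?.sizing] ?? 'FIXED',
    layoutSizingVertical: SIZING[style.heightSpec?.sizing] ?? 'FIXED',
  }
}
```

The function returns data only and creates nothing in Figma. Inside a plugin, the values would still have to be applied to a frame in an order the Plugin API accepts.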

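FAQ

Can I convert a whole document in one request?

No. The endpoint works one page at a time. Loop on the client and send the requests concurrently.

How accurate is the conversion?

Recognition quality depends on the PDF's source, text legibility, and structural complexity. Scanned legal forms and handwritten input score lower, so always check processingMeta.detectionScore.

Are fonts preserved?

Font name, size, weight, and color are all preserved in styleConfig. To render them, the original fonts must be installed on the consuming side.

Is there an SDK?

You don't need one: the API is plain HTTP + JSON. An official SDK is on the roadmap.

Can it be deployed on-premises?

Yes, on the Enterprise plan. Contact [email protected].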

Next steps

Visual Struct: same schema, image input.
Remove BG: generates transparent-background images, useful for compositing rendered output.
Full endpoint reference: /api#convert-pdf-to-design.