Schema Metadata
Define tables, dimensions, metrics, and relationships for better SQL.
Basic Operations
Create table schema
schema = client.schema_metadata.create(
project_id=project.id,
name="Users Table",
schema_data={
"table": {
"name": "users",
"columns": [
{"name": "id", "type": "INTEGER"},
{"name": "email", "type": "VARCHAR(255)"}
]
}
},
)
Create dimension:
dimension = client.schema_metadata.create(
project_id=project.id,
name="User Status",
schema_data={
"table": {
"dimension": {
"name": "status",
"content": {"type": "categorical", "values": ["active","inactive"]}
}
}
},
)
Create metric:
metric = client.schema_metadata.create(
project_id=project.id,
name="Total Revenue",
schema_data={
"table": {
"metric": {
"name": "total_revenue",
"content": {"aggregation": "sum", "column": "amount"}
}
}
},
)
Create relationship:
relationship = client.schema_metadata.create(
project_id=project.id,
name="User Orders Relationship",
schema_data={
"relationship": {
"from_table": "users",
"to_table": "orders",
"from_column": "id",
"to_column": "user_id",
"type": "one_to_many"
}
},
)
List/get/update/delete:
schemas = client.schema_metadata.list(project_id=project.id)
one = client.schema_metadata.get(project_id=project.id, schema_metadata_id=schema.id)
updated = client.schema_metadata.update(project_id=project.id, schema_metadata_id=schema.id, description="Updated")
client.schema_metadata.delete(project_id=project.id, schema_metadata_id=schema.id)
Filter by type
tables = client.schema_metadata.list_by_type(project.id, "table")
Validation helpers
errors = client.schema_metadata.validate_schema({"table": {"name": "users", "columns": []}}, "table")
stype = client.schema_metadata.get_schema_type({"table": {"name": "users", "columns": []}})
Working with Large Tables (Schema Splitting)
What is Schema Splitting?
When you create a table schema with more than 8 columns, the API automatically splits it into multiple parts. This improves RAG performance by creating smaller, more focused chunks for retrieval.
Understanding the Return Type
The create() method returns:
- Single object (
SchemaMetadataResponse) for schemas with ≤8 columns - List (
List[SchemaMetadataResponse]) for schemas with >8 columns
Handling Split Results
Always check the return type:
result = client.schema_metadata.create(
project_id=project.id,
name="Large Customer Table",
schema_data={
"table": {
"name": "customers",
"columns": [
{"name": "id", "type": "INTEGER"},
{"name": "email", "type": "VARCHAR(255)"},
{"name": "first_name", "type": "VARCHAR(100)"},
{"name": "last_name", "type": "VARCHAR(100)"},
{"name": "phone", "type": "VARCHAR(20)"},
{"name": "address", "type": "VARCHAR(255)"},
{"name": "city", "type": "VARCHAR(100)"},
{"name": "state", "type": "VARCHAR(50)"},
{"name": "zip", "type": "VARCHAR(10)"},
{"name": "country", "type": "VARCHAR(50)"},
]
}
}
)
# Handle both cases
if isinstance(result, list):
print(f"Schema split into {len(result)} parts")
split_group_id = result[0].split_group_id
for part in result:
print(f"Part {part.split_index}/{part.total_splits}: {part.id}")
else:
print(f"Single schema created: {result.id}")
Retrieving Split Groups
Get all parts of a split schema:
# Get complete split group information
split_group = client.schema_metadata.get_split_group(
project_id=project.id,
split_group_id=split_group_id
)
print(f"Total parts: {split_group['total_parts']}")
for part in split_group['parts']:
print(f"Part {part.split_index}: {part.name}")
Deletion Behavior
Deleting any part of a split schema automatically deletes all parts in the group:
# Delete just one part - all parts are removed
client.schema_metadata.delete(
project_id=project.id,
schema_metadata_id=result[0].id # Deletes entire group
)
Best Practices
- Always use
isinstance()to check if result is a list - Store
split_group_idif you need to retrieve all parts later - Delete any single part - no need to delete each part individually
- Consider column organization - group related columns in first 8 for better relevance
See Also
- How To → Bulk Operations: Learn about
bulk_create()with split schemas - How To → Validation: Required nested fields validation
Feedback
- Submit and view feedback for this page
- Send feedback about Text 2 Everything Python Documentation to cloud-feedback@h2o.ai