Example Notebooks
Ready-to-use Databricks notebooks demonstrating end-to-end TitanRDM SDK workflows. Import these directly into your Databricks workspace.
Available Notebooks
| Notebook | Description | Download |
| SparkSync Example | Automated upload/download using SparkSync | databricks_spark_sync_example.py |
| Convention Sync Example | Manual convention-based sync with full control | databricks_sync_example.py |
| SDK System Tests | Comprehensive test of all SDK methods | databricks_system_tests.py |
Importing Notebooks into Databricks
- Download the .py file from the links above
- In Databricks, navigate to Workspace
- Click Import
- Select the downloaded .py file
- The file will be imported as a Databricks notebook automatically
These files use the Databricks notebook source format (the first line is # Databricks notebook source), so Databricks recognises them natively as notebooks.
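The source format is ordinary Python with comment markers: a `# Databricks notebook source` header on the first line and `# COMMAND ----------` lines between cells. The sketch below checks for the header and splits a file into cells. The helper name `split_notebook_source` and the sample text are illustrative assumptions, not the contents of the downloadable notebooks.

```python
HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"


def split_notebook_source(text: str) -> list[str]:
    """Return the non-empty cells of a Databricks source-format .py file.

    Raises ValueError if the first line is not the notebook header.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != HEADER:
        raise ValueError("not a Databricks notebook source file")

    cells, current = [], []
    for line in lines[1:]:
        if line.strip() == CELL_SEPARATOR:
            # A separator closes the cell collected so far.
            cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip())
    return [cell for cell in cells if cell]


# Hypothetical two-cell notebook, for illustration only.
sample = """# Databricks notebook source
# MAGIC %pip install titan-rdm-sdk

# COMMAND ----------

from titan_rdm_sdk import TitanRDMClient
"""

cells = split_notebook_source(sample)
print(len(cells))  # 2
```

Running a check like this on a downloaded file before importing it is a quick way to confirm the header survived the download intact.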
SparkSync Example
File: databricks_spark_sync_example.py
This notebook demonstrates the SparkSync class with four scenarios:
| # | Direction | Scope |
| 1 | Upload | Entire domain (Clinics) — all deployed tables |
| 2 | Upload | Specific tables (Sites, Delivery Centre, Org Unit) |
| 3 | Download | Entire domain (Clinics) — all deployed tables |
| 4 | Download | Specific tables (Sites, Delivery Centre, Org Unit) |
Key Concepts
- Uses SparkSync for automatic catalog read/write
- Configurable via Databricks widgets (branch_name, catalog, download_schema, upload_schema)
- Credentials loaded from Databricks secret scope titan-rdm
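The widget-driven configuration listed above could be read along the following lines. This is a minimal sketch: the widget names come from this page, but the default values are assumptions borrowed from the values used in the walkthrough, and `read_widgets` is a hypothetical helper rather than part of the notebook or the SDK. It relies only on `dbutils.widgets.text` and `dbutils.widgets.get`.

```python
# Widget names are from the notebook; the defaults are assumptions that
# mirror the literal values used elsewhere on this page.
WIDGET_DEFAULTS = {
    "branch_name": "dev",
    "catalog": "hive_metastore",
    "download_schema": "rdmin",
    "upload_schema": "rdmout",
}


def read_widgets(dbutils, defaults=WIDGET_DEFAULTS):
    """Create one text widget per setting and return the current values."""
    for name, default in defaults.items():
        # Creates the widget with a default value if it does not exist yet.
        dbutils.widgets.text(name, default)
    return {name: dbutils.widgets.get(name) for name in defaults}
```

In a notebook cell, `config = read_widgets(dbutils)` would return a dictionary whose values can then replace hard-coded strings such as the branch name or catalog. Passing `dbutils` in as an argument keeps the helper testable outside Databricks.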
Notebook Walkthrough
Setup:
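from titan_rdm_sdk import TitanRDMClient
from titan_rdm_sdk.spark_sync import SparkSync
# dbutils and spark are provided by the Databricks notebook runtime.
# Credentials are read from the titan-rdm secret scope.
client = TitanRDMClient(
    url=dbutils.secrets.get(scope="titan-rdm", key="url"),
    client_id=dbutils.secrets.get(scope="titan-rdm", key="client_id"),
    client_secret=dbutils.secrets.get(scope="titan-rdm", key="client_secret"),
)
# Resolve the branch to sync against, then bind the client to the Spark session.
branch = client.get_branch_by_name("dev")
sync = SparkSync(client=client, spark=spark)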
Upload entire domain:
upload_results = sync.upload_sync_by_convention(
    branch_id=branch.id,
    source_catalog="hive_metastore",
    source_schema="rdmout",
    target_domain_name="Clinics",
)
Download entire domain:
download_results = sync.download_sync_by_convention(
    branch_id=branch.id,
    target_catalog="hive_metastore",
    target_schema="rdmin",
    source_domain_name="Clinics",
)
Upload specific tables:
upload_results = sync.upload_sync_by_convention(
    branch_id=branch.id,
    source_catalog="hive_metastore",
    source_schema="rdmout",
    target_domain_name="Clinics",
    target_table_names=["Site", "Delivery Centre", "Org Unit"],
)
Convention Sync Example
File: databricks_sync_example.py
This notebook demonstrates the manual convention-based approach — giving you full control over the sync loop while still following the naming convention.
Key Concepts
- Discovers all domains and deployed tables automatically
- No hard-coded table lists — adding a table in TitanRDM includes it in the next sync
- Manual control over the upload/download loop
- Single import batch for all tables
Notebook Walkthrough
Discover metadata:
domains = client.get_domains()
# Collect (domain, table) pairs for every deployed table in every domain.
sync_manifest = []
for domain in domains:
    tables = client.get_deployed_table_definitions(
        branch_id=branch.id,
        domain_id=domain.id,
    )
    for t in tables:
        sync_manifest.append((domain, t))
Upload all tables:
# One upload batch is shared by every table in the manifest.
upload = client.get_upload(
    branch_id=branch.id,
    description="Convention sync upload",
    correlation_code="convention-upload",
)
for domain, table in sync_manifest:
    # CATALOG and UPLOAD_SCHEMA are expected to be defined earlier in the notebook.
    source_table = f"{CATALOG}.{UPLOAD_SCHEMA}.{domain.abbreviation}_{table.database_table_name}"
    source_df = spark.table(source_table).toPandas()
    import_mapping = client.get_default_import_mapping(table.id)
    table_upload = upload.get_table_upload(
        table_definition_key=table.key,
        import_mapping_key=import_mapping.key,
        pattern="full",
    )
    table_upload.send(source_df)
# Complete the batch once, after all tables have been sent.
upload.complete(message="Convention sync upload completed")
Download all tables:
for domain, table in sync_manifest:
    # CATALOG and DOWNLOAD_SCHEMA are expected to be defined earlier in the notebook.
    target_table = f"{CATALOG}.{DOWNLOAD_SCHEMA}.{domain.abbreviation}_{table.database_table_name}"
    download = client.get_download(
        branch_id=branch.id,
        table_definition_key=table.key,
        pattern="full",
    )
    # Wait for the export to be ready before reading it.
    download.wait_until_ready(poll_interval=2.0, max_wait=300.0)
    df = download.receive()
    # Convert to a Spark DataFrame and replace the target table, schema included.
    spark_df = spark.createDataFrame(df)
    spark_df.write.mode("overwrite").option("overwriteSchema", "true").saveAsTable(target_table)
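Both loops above build the catalog table name from the same pattern: catalog, schema, then the domain abbreviation and the database table name joined by an underscore. A small helper makes the convention explicit and keeps the upload and download sides in step. This is a sketch only: `convention_table_name` is a hypothetical function, and the `SimpleNamespace` objects stand in for SDK domain and table objects with made-up attribute values.

```python
from types import SimpleNamespace


def convention_table_name(catalog, schema, domain, table):
    """Build <catalog>.<schema>.<domain abbreviation>_<database table name>."""
    return f"{catalog}.{schema}.{domain.abbreviation}_{table.database_table_name}"


# Hypothetical stand-ins for SDK objects; the values are invented for illustration.
domain = SimpleNamespace(abbreviation="cln")
table = SimpleNamespace(database_table_name="site")

upload_source = convention_table_name("hive_metastore", "rdmout", domain, table)
download_target = convention_table_name("hive_metastore", "rdmin", domain, table)

print(upload_source)    # hive_metastore.rdmout.cln_site
print(download_target)  # hive_metastore.rdmin.cln_site
```

Only the schema differs between the two directions, which is why the notebooks need separate upload and download schemas.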
SDK System Tests
File: databricks_system_tests.py
A comprehensive test notebook that exercises all SDK methods against a real TitanRDM instance. Use this to verify your environment is configured correctly.
Tests Included
| Test | Description |
| 1 | Authentication |
| 2 | Full upload workflow (create batch → upload → complete) |
| 3 | Full download workflow (create export → wait → receive) |
| 4 | Incremental upload |
| 5 | Error handling (invalid IDs) |
| 6 | List branches |
| 7 | Get branch by name |
| 8 | Get branch by ID |
| 9 | List domains |
| 10 | Get domain by name and ID |
| 11 | List deployed table definitions |
| 12 | Get deployed table definition by key |
Configuration Widgets
| Widget | Default | Description |
| branch_id | 174 | Target branch for testing |
| table_definition_key | 100 | Table for upload/download tests |
| import_mapping_key | 10 | Mapping for upload tests |
| test_csv_path | /dbfs/test_data/customers.csv | Path to test CSV |
Prerequisites (All Notebooks)
1. Secret Scope
databricks secrets create-scope --scope titan-rdm
databricks secrets put --scope titan-rdm --key url
databricks secrets put --scope titan-rdm --key client_id
databricks secrets put --scope titan-rdm --key client_secret
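Once the scope is populated, a notebook can confirm that the three keys exist before constructing the client. The sketch below assumes `dbutils.secrets.list(scope)` returns metadata objects with a `key` attribute, as the Databricks secrets utility does. `missing_secret_keys` is a hypothetical helper, not part of the TitanRDM SDK.

```python
REQUIRED_KEYS = ("url", "client_id", "client_secret")


def missing_secret_keys(dbutils, scope="titan-rdm", required=REQUIRED_KEYS):
    """Return the required keys that are absent from the secret scope."""
    present = {meta.key for meta in dbutils.secrets.list(scope)}
    return [key for key in required if key not in present]
```

In a notebook, `missing_secret_keys(dbutils)` should return an empty list when the scope is set up as described above. A non-empty result names the keys that still need to be added.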
2. SDK Installation
%pip install titan-rdm-sdk
3. Schemas (for Sync Notebooks)
CREATE SCHEMA IF NOT EXISTS rdmin;
CREATE SCHEMA IF NOT EXISTS rdmout;
Next Steps
- Spark Sync — Detailed SparkSync documentation
- Convention Sync — Understanding the naming convention
- SDK Client Methods — Full method reference
- Getting Started — Installation and authentication