Rethinking Data Modeling: How GitHub Copilot Is Changing the Way We Design Systems

Every engineer reaches a stage in their career when data modelling becomes less straightforward.

What begins as a clear schema soon becomes complex. Relationships can get blurred, query paths expand, and minor changes can lead to significant repercussions.

This is when the true challenge of data modelling emerges: it’s not just about storing data but maintaining clarity as systems grow.

This is where GitHub Copilot changed how we work: it shortened the path from a design idea to a schema draft we could review.

WHEN DATA MODELS NO LONGER ALIGN WITH OUR THINKING
Data modelling issues rarely stem from SQL syntax mistakes but rather from architectural choices.

As architects and data engineers, we often ponder:

  • Which concepts are genuine domain entities, and which are temporary implementation details?
  • Which relationships need to be strict, and which can remain adaptable?
  • How can we model for today while ensuring it won’t break six months down the line?

In the past, we tackled these challenges with whiteboards, manual DDL rewrites, and lengthy iterative cycles. While effective, this process was painfully slow.

GitHub Copilot changed the dynamics. It narrowed the gap between our intentions and a solid model draft.

We used GitHub Copilot in VS Code in two ways: inline suggestions helped during DDL refinements, while Copilot Chat supported our architectural reasoning and constraint validation.

A REAL-WORLD EXAMPLE
SaaS Control Plane:

A SaaS control plane acts as the administrative core of a multi-tenant platform. It operates independently from the actual application workload (the data plane) and manages the tenant lifecycle: who gets access to what, under which conditions, in which environment, and in what deployment status. In essence, it is the system that answers “who is doing what, where, and in what state” at any moment.

At scale, multi-tenant SaaS products cannot afford to manage tenant provisioning, subscription rights, and infrastructure allocation by hand. The control plane automates and tracks these processes so that they stay consistent and auditable.

We used Copilot to speed up the design of a control-plane data model for the following flow:

  1. A tenant is onboarded
  2. The tenant chooses an application
  3. The tenant selects a service tier
  4. The tenant picks a target environment
  5. A deployment is coordinated
  6. Necessary infrastructure is provisioned
  7. Deployment and infrastructure information is logged for audit and operations

This was not a trivial schema. Every entity included lifecycle states, ownership boundaries, and interactions with data engineering and observability.

CHALLENGES WE FACED WITHOUT COPILOT
Boundary Confusion
Initially, we blurred the lines between desired state (what should happen) and runtime state (what actually occurred), causing inconsistent query behaviour and weakened audit trails.
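
One way to picture the separation is sketched below with Python's built-in sqlite3 module. The table names, the desired_version and deployed_version columns, and the sample rows are all hypothetical and are not taken from our actual schema. The sketch also assumes that a higher deployment_id means a later deployment. The subscription row carries what should be running, deployment rows record what actually happened, and a query compares the two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Desired state: what should be running (hypothetical columns).
CREATE TABLE subscription (
    subscription_id INTEGER PRIMARY KEY,
    desired_version TEXT NOT NULL
);
-- Runtime state: what actually happened, one row per attempt.
CREATE TABLE deployment (
    deployment_id    INTEGER PRIMARY KEY,
    subscription_id  INTEGER NOT NULL REFERENCES subscription(subscription_id),
    deployed_version TEXT NOT NULL,
    status           TEXT NOT NULL CHECK (status IN ('succeeded', 'failed'))
);
INSERT INTO subscription VALUES (1, '2.0'), (2, '1.4');
INSERT INTO deployment (subscription_id, deployed_version, status) VALUES
    (1, '1.9', 'succeeded'),
    (1, '2.0', 'failed'),
    (2, '1.4', 'succeeded');
""")

# Subscriptions whose latest successful deployment does not match the desired
# version, or that have no successful deployment at all.
DRIFT_QUERY = """
SELECT s.subscription_id, s.desired_version, d.deployed_version
FROM subscription AS s
LEFT JOIN deployment AS d
  ON d.deployment_id = (
      SELECT MAX(deployment_id)
      FROM deployment
      WHERE subscription_id = s.subscription_id AND status = 'succeeded'
  )
WHERE d.deployed_version IS NULL OR d.deployed_version <> s.desired_version
ORDER BY s.subscription_id
"""

drifted = conn.execute(DRIFT_QUERY).fetchall()
print(drifted)
```

Keeping the two kinds of state in separate tables means a failed rollout never overwrites the record of what was requested, which is the property the audit trail depends on.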

Relationship Drift
As we introduced deployment history and infrastructure tracking, our initial assumptions about cardinality shifted. What began as a one-to-one relationship turned into one-to-many after redeployments and rollbacks.
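
The shift can be illustrated with a small sqlite3 sketch. Everything here is an assumption made for illustration: the table names deployment_v1 and deployment_v2, the attempt_no column, and the kind values. A UNIQUE constraint encodes the original one-to-one assumption and rejects a redeployment, while the revised table accepts a sequence of attempts per subscription.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscription (subscription_id INTEGER PRIMARY KEY);

-- Original assumption: one deployment per subscription (one-to-one).
CREATE TABLE deployment_v1 (
    deployment_id   INTEGER PRIMARY KEY,
    subscription_id INTEGER NOT NULL UNIQUE
        REFERENCES subscription(subscription_id)
);

-- Revised model: many deployments per subscription, ordered by attempt number.
CREATE TABLE deployment_v2 (
    deployment_id   INTEGER PRIMARY KEY,
    subscription_id INTEGER NOT NULL REFERENCES subscription(subscription_id),
    attempt_no      INTEGER NOT NULL,
    kind            TEXT NOT NULL
        CHECK (kind IN ('deploy', 'redeploy', 'rollback')),
    UNIQUE (subscription_id, attempt_no)
);

INSERT INTO subscription (subscription_id) VALUES (1);
""")


def try_insert(sql: str, params: tuple) -> bool:
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


V1_INSERT = "INSERT INTO deployment_v1 (subscription_id) VALUES (?)"
V2_INSERT = (
    "INSERT INTO deployment_v2 (subscription_id, attempt_no, kind) "
    "VALUES (?, ?, ?)"
)

v1_first = try_insert(V1_INSERT, (1,))     # accepted
v1_redeploy = try_insert(V1_INSERT, (1,))  # rejected by the UNIQUE constraint
v2_results = [
    try_insert(V2_INSERT, (1, attempt_no, kind))
    for attempt_no, kind in enumerate(["deploy", "redeploy", "rollback"], start=1)
]
print(v1_first, v1_redeploy, v2_results)
```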

Iteration Costs
Every update meant manual rewrites across table definitions, constraints, and foreign key references. Naming inconsistencies became a recurring issue during reviews.

Team Alignment Friction
Architecture discussions moved faster than we could produce review-ready schema drafts.

HOW GITHUB COPILOT ASSISTED
Faster Initial Drafts
By describing the onboarding and deployment processes in plain language, we quickly received a complete relational baseline. What previously required lengthy whiteboard sessions and multiple rounds of rewriting was review-ready in one focused afternoon.

Timely Pattern Recognition
Copilot repeatedly suggested useful structures:

  • A subscription pivot entity to link tenant, application, tier, and environment
  • Immutable records of deployment history
  • An infrastructure resource catalogue with resource typing
  • A deployment event trail for effective auditability

Rapid Refinement
When we needed to adjust our assumptions, Copilot updated the affected structures consistently, sparing us the need to patch every related reference by hand.

Enhanced Design Discussions
Because drafts arrived sooner, reviews shifted from correcting syntax to evaluating architectural quality:

  • Is tenant isolation clear?
  • Is deployment history reliable?
  • Are infrastructure dependencies traceable?
  • Can data engineering create analytics without fragile joins?

CORE RELATIONSHIPS FINALIZED

  • Tenant to Tenant App Subscription: one-to-many
  • Application to Tenant App Subscription: one-to-many
  • Service Tier to Tenant App Subscription: one-to-many
  • Environment to Tenant App Subscription: one-to-many
  • Tenant App Subscription to Deployment: one-to-many
  • Deployment to Infra Resources: one-to-many
  • Deployment to Deployment Events: one-to-many

LOGICAL RELATIONSHIP VIEW
Tenant -> Tenant App Subscription
Application -> Tenant App Subscription
Service Tier -> Tenant App Subscription
Environment -> Tenant App Subscription
Tenant App Subscription -> Deployment
Deployment -> Infra Resources
Deployment -> Deployment Events
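
The relationships above can be sketched as DDL. The block below is a minimal illustration that uses Python's built-in sqlite3 module. Only the entities and their one-to-many relationships come from the list above; every table name, column name, and sample value is a hypothetical placeholder, and a production schema would need lifecycle states and further attributes that are omitted here.

```python
import sqlite3

# Hypothetical names throughout; only the relationships follow the text.
SCHEMA = """
CREATE TABLE tenant (
    tenant_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE application (
    application_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
CREATE TABLE service_tier (
    service_tier_id INTEGER PRIMARY KEY,
    name            TEXT NOT NULL UNIQUE
);
CREATE TABLE environment (
    environment_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
-- Pivot entity: each parent row may appear in many subscriptions.
CREATE TABLE tenant_app_subscription (
    subscription_id INTEGER PRIMARY KEY,
    tenant_id       INTEGER NOT NULL REFERENCES tenant(tenant_id),
    application_id  INTEGER NOT NULL REFERENCES application(application_id),
    service_tier_id INTEGER NOT NULL REFERENCES service_tier(service_tier_id),
    environment_id  INTEGER NOT NULL REFERENCES environment(environment_id)
);
CREATE TABLE deployment (
    deployment_id   INTEGER PRIMARY KEY,
    subscription_id INTEGER NOT NULL
        REFERENCES tenant_app_subscription(subscription_id),
    status          TEXT NOT NULL,
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE infra_resource (
    resource_id   INTEGER PRIMARY KEY,
    deployment_id INTEGER NOT NULL REFERENCES deployment(deployment_id),
    resource_type TEXT NOT NULL
);
CREATE TABLE deployment_event (
    event_id      INTEGER PRIMARY KEY,
    deployment_id INTEGER NOT NULL REFERENCES deployment(deployment_id),
    event_type    TEXT NOT NULL,
    occurred_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def create_control_plane_db() -> sqlite3.Connection:
    """Create an in-memory database containing the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = create_control_plane_db()
with conn:
    conn.execute("INSERT INTO tenant (tenant_id, name) VALUES (1, 'acme')")
    conn.execute("INSERT INTO application (application_id, name) VALUES (1, 'billing')")
    conn.execute("INSERT INTO service_tier (service_tier_id, name) VALUES (1, 'standard')")
    conn.execute("INSERT INTO environment (environment_id, name) VALUES (1, 'prod')")
    conn.execute("INSERT INTO tenant_app_subscription VALUES (1, 1, 1, 1, 1)")
    # Two deployments for one subscription: the one-to-many case.
    conn.executemany(
        "INSERT INTO deployment (subscription_id, status) VALUES (?, ?)",
        [(1, "succeeded"), (1, "rolled_back")],
    )

deployments_per_subscription = conn.execute(
    "SELECT subscription_id, COUNT(*) FROM deployment GROUP BY subscription_id"
).fetchall()
print(deployments_per_subscription)
```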

EXAMPLES OF PROMPTS WE USED
Prompt 1
Create a relational schema for a multi-tenant onboarding process where tenants choose an application, service tier, and environment, followed by deployment and infrastructure tracking. Include reasoning for relationships.

Prompt 2
Given this schema, identify potential cardinality issues and propose safer constraints for repetitive deployments.

Prompt 3
Refine the model so that deployment records represent immutable history, while the subscription retains the current desired state.

Prompt 4
Suggest naming and normalisation enhancements to allow data engineers to construct reliable analytics models from these tables.

Prompt 5
Highlight operational risks in this model and which table-level attributes could bolster troubleshooting and auditability.

Prompt 6
Tenant license generation is now mandatory before a tenant can access a chosen application. Suggest a minimal schema change to capture the issuance, validity, and status of licenses without affecting existing subscription and deployment records.

SHIFTING THOUGHT PROCESSES
Before Copilot, most of our effort went into translating architectural ideas into SQL.

With Copilot, that transformation became so quick that our main focus shifted to evaluation:

  • Is tenant isolation explicit in the schema or concealed within app logic?
  • Is deployment history append-only and secure?
  • Can infrastructure resources be tracked independently?
  • Is this model user-friendly for analytics teams?

This transition from writing to evaluating significantly improved the quality of our designs.
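
As one illustration of the append-only question, the sketch below enforces immutability at the database level with SQLite triggers. This is an assumed technique chosen for the example, not necessarily what our schema uses, and the deployment_history table and its columns are hypothetical. Other engines would typically rely on permissions or their own trigger syntax instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deployment_history (
    deployment_id   INTEGER PRIMARY KEY,
    subscription_id INTEGER NOT NULL,
    status          TEXT NOT NULL
);

-- Reject any attempt to rewrite or remove history.
CREATE TRIGGER deployment_history_no_update
BEFORE UPDATE ON deployment_history
BEGIN
    SELECT RAISE(ABORT, 'deployment_history is append-only');
END;

CREATE TRIGGER deployment_history_no_delete
BEFORE DELETE ON deployment_history
BEGIN
    SELECT RAISE(ABORT, 'deployment_history is append-only');
END;
""")


def is_rejected(sql: str) -> bool:
    """Return True if the statement is blocked by the database."""
    try:
        with conn:  # rolls back automatically when the statement fails
            conn.execute(sql)
        return False
    except sqlite3.DatabaseError:
        return True


# Appending new rows is always allowed.
with conn:
    conn.execute(
        "INSERT INTO deployment_history (subscription_id, status) "
        "VALUES (1, 'succeeded')"
    )
    conn.execute(
        "INSERT INTO deployment_history (subscription_id, status) "
        "VALUES (1, 'rolled_back')"
    )

update_blocked = is_rejected(
    "UPDATE deployment_history SET status = 'failed' WHERE deployment_id = 1"
)
delete_blocked = is_rejected(
    "DELETE FROM deployment_history WHERE deployment_id = 1"
)
row_count = conn.execute("SELECT COUNT(*) FROM deployment_history").fetchone()[0]
print(update_blocked, delete_blocked, row_count)
```

A rollback is then recorded as a new row rather than as an edit to an old one, so the history stays trustworthy for audits.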

LESSONS FROM SCHEMA EVOLUTION
Two significant changes occurred mid-design:

  1. License generation was established as a required step before a tenant could access a selected application. The license needed to track the issuance date, validity period, and revocation status.

  2. Service-tier entitlements became time-sensitive.

In both cases, we described the new requirement in Copilot Chat and supplied the existing schema as context. Copilot recommended additive changes. For the first change, it proposed a new tenant license table linked to the subscription, with license status and validity tracked separately. For the second, it proposed a distinct entitlement history table.
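
A rough sketch of what such additive changes might look like follows, again using sqlite3. The existing tables are reduced to stand-ins, and all names (tenant_license, tier_entitlement_history, the date columns, the status values) are hypothetical rather than the actual output we received. The point of the sketch is that the migration only creates new tables and leaves the subscription table untouched.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Stand-ins for tables that already exist before the change.
conn.executescript("""
CREATE TABLE service_tier (
    service_tier_id INTEGER PRIMARY KEY,
    name            TEXT NOT NULL
);
CREATE TABLE tenant_app_subscription (
    subscription_id INTEGER PRIMARY KEY,
    service_tier_id INTEGER NOT NULL REFERENCES service_tier(service_tier_id)
);
INSERT INTO service_tier VALUES (1, 'standard');
INSERT INTO tenant_app_subscription VALUES (10, 1);
""")

# Additive migration: new tables only, no ALTER on existing ones.
ADDITIVE_MIGRATION = """
CREATE TABLE tenant_license (
    license_id      INTEGER PRIMARY KEY,
    subscription_id INTEGER NOT NULL
        REFERENCES tenant_app_subscription(subscription_id),
    issued_on       TEXT NOT NULL,   -- ISO 8601 dates compare correctly as text
    valid_from      TEXT NOT NULL,
    valid_until     TEXT NOT NULL,
    status          TEXT NOT NULL CHECK (status IN ('active', 'revoked')),
    CHECK (valid_from <= valid_until)
);
CREATE TABLE tier_entitlement_history (
    entitlement_id  INTEGER PRIMARY KEY,
    service_tier_id INTEGER NOT NULL REFERENCES service_tier(service_tier_id),
    entitlement     TEXT NOT NULL,
    effective_from  TEXT NOT NULL,
    effective_to    TEXT             -- NULL means still in effect
);
"""
conn.executescript(ADDITIVE_MIGRATION)


def has_valid_license(subscription_id: int, on: date) -> bool:
    """Check for an active license that covers the given date."""
    row = conn.execute(
        """
        SELECT 1 FROM tenant_license
        WHERE subscription_id = ?
          AND status = 'active'
          AND ? BETWEEN valid_from AND valid_until
        LIMIT 1
        """,
        (subscription_id, on.isoformat()),
    ).fetchone()
    return row is not None


with conn:
    conn.execute(
        "INSERT INTO tenant_license "
        "(subscription_id, issued_on, valid_from, valid_until, status) "
        "VALUES (10, '2024-01-01', '2024-01-01', '2024-12-31', 'active')"
    )

print(has_valid_license(10, date(2024, 6, 1)))
print(has_valid_license(10, date(2025, 1, 1)))
```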

WHERE ENGINEERING JUDGMENT IS STILL ESSENTIAL
Copilot provides options but not certainties. It’s crucial for engineers to maintain ownership over:

  • The truth of the domain and lifecycle semantics
  • Cardinality accuracy
  • Performance and scaling trade-offs
  • Governance, compliance, and data retention
  • A backward-compatible migration strategy

All outputs from Copilot should undergo architecture review before they hit production.

FINAL THOUGHTS
Data modelling remains an architectural discipline. What has evolved is the speed of iteration.

GitHub Copilot enables teams to translate ideas into structures more swiftly, compare options earlier, and prioritise design quality over simply drafting models.

The bottleneck is no longer in crafting schemas. It lies in thinking clearly about the problem at hand, and AI-assisted modelling helps most by freeing up time for that thinking.

GIVE IT A GO YOURSELF

  1. Launch GitHub Copilot Chat in VS Code.
  2. Describe your primary entities and the lifecycle questions you have.
  3. Request at least two schema alternatives.
  4. Evaluate constraints and operational trade-offs.
  5. Share your insights with the community.
