MyHealth Portal - Schema Architecture

Introduction

Today is all about structuring data. We are moving from the high-level boxes we drew yesterday to the actual JSON structures that will live in our database. We are using Google Cloud Firestore, a NoSQL, document-oriented database.

Design Goals for Firestore

Read Optimization

Design for how the app reads data, not how it writes it.

Security

The structure must make it easy to write security rules to keep patient data private.

Flexibility

The structure needs to handle the varied extracted data coming from the AI.

The Database Structure (High-Level View)

We will use two top-level collections and one sub-collection.

Creation Root
├── users/ (Collection)
│   ├── {userId_doctor}/ (Document)
│   └── {userId_patient}/ (Document)
│       └── lab_reports/ (Sub-Collection)
│           └── {reportId}/ (Document - The AI Data)
└── appointments/ (Collection)
    └── {appointmentId}/ (Document)
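To make the tree concrete, here is a minimal sketch of path helpers for the three document types. The function names are hypothetical and not part of any SDK; the code only builds path strings. It relies on two Firestore facts: a document ID cannot contain a forward slash, and document paths always have an even number of segments.

```typescript
// Firestore document IDs cannot be empty or contain a forward slash.
function assertValidId(id: string): string {
  if (id.length === 0 || id.indexOf("/") !== -1) {
    throw new Error(`Invalid document ID: "${id}"`);
  }
  return id;
}

// users/{userId}
function userPath(userId: string): string {
  return `users/${assertValidId(userId)}`;
}

// users/{patientId}/lab_reports (a collection path)
function labReportsPath(patientId: string): string {
  return `${userPath(patientId)}/lab_reports`;
}

// users/{patientId}/lab_reports/{reportId}
function labReportPath(patientId: string, reportId: string): string {
  return `${labReportsPath(patientId)}/${assertValidId(reportId)}`;
}

// appointments/{appointmentId}
function appointmentPath(appointmentId: string): string {
  return `appointments/${assertValidId(appointmentId)}`;
}

// Collections sit at odd depths and documents at even depths,
// so a document path always has an even number of segments.
function isDocumentPath(path: string): boolean {
  return path.split("/").length % 2 === 0;
}
```

Keeping path construction in one place means a later rename of a collection touches a single file.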

Detailed Schema Definitions

Here are the specific JSON structures for each document type.

1. The Users Collection (users)

We use the Firebase Authentication UID as the Document ID. This is crucial: once a user signs in, the app can fetch their profile directly at users/{uid} without running a query.

Why this structure? We use a role field to differentiate views in the frontend. For patients, we explicitly store the assignedDoctorId. This is the "foreign key" that allows doctors to query "give me all patients assigned to me."

// Path: /users/{userId}
{
  "userId": "uid_abc123",        // Matches Auth UID
  "email": "patient@test.com",
  "displayName": "John Doe",
  "role": "patient",           // "patient" | "doctor"
  "createdAt": "2023-10-27T10:00:00Z",
  
  // If Role == Patient
  "assignedDoctorId": "uid_doc456",
  "dateOfBirth": "1985-01-01",
  
  // If Role == Doctor (Optional)
  "specialty": "General Practitioner"
}
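The role-dependent fields above can be expressed as a discriminated union, so the compiler knows that assignedDoctorId exists only on patients. This is a sketch in plain TypeScript with no Firebase SDK involved; the type names and the patientsOf helper are hypothetical, and the helper filters an in-memory array as a stand-in for the real "patients assigned to me" query.

```typescript
// Fields shared by every document in /users/{userId}.
interface BaseUser {
  userId: string;      // matches the Firebase Auth UID
  email: string;
  displayName: string;
  createdAt: string;   // ISO 8601 string, as in the example above
}

interface PatientUser extends BaseUser {
  role: "patient";
  assignedDoctorId: string;
  dateOfBirth: string; // YYYY-MM-DD
}

interface DoctorUser extends BaseUser {
  role: "doctor";
  specialty?: string;  // optional, as in the schema above
}

type UserDoc = PatientUser | DoctorUser;

// Hypothetical helper: in-memory equivalent of
// "give me all patients assigned to me".
function patientsOf(doctorId: string, users: UserDoc[]): PatientUser[] {
  return users.filter(
    (u): u is PatientUser =>
      u.role === "patient" && u.assignedDoctorId === doctorId
  );
}
```

Checking role first narrows the type, which is the same check the frontend would use to pick the patient or doctor view.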

2. The Lab Reports Sub-Collection (users/{uid}/lab_reports)

This is where the "magic" happens. When the Cloud Function + Document AI finishes processing the PDF, it dumps the result here.

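Why a sub-collection? It makes security rules very simple: a patient can read only the documents inside their own lab_reports sub-collection, and a doctor's access can be checked against the assignedDoctorId on the parent user document.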
Why the parsedData map? Since different labs have different tests, we use a flexible Map (object) structure to store whatever the AI finds. We don't know in advance if it will be "Glucose" or "Hemoglobin".

// Collection: users/{patientUid}/lab_reports
// Document ID: <Auto-generated ID>
{
  // --- Metadata about the file ---
  "uploadTimestamp": "2023-10-28T14:30:00Z", // Timestamp
  "status": "processed", // ENUM: "uploading", "processing", "processed", "failed"
  
  // Path to the original PDF in Cloud Storage, in case the user wants to download it
  "storagePath": "gs://bucket-name/uploads/dr_id/pat_id/report1.pdf",
  "fileName": "Oct28_BloodWork.pdf",

  // --- The AI extracted data ---
  // We attempt to extract the date the actual test happened from the PDF
  "testDate": "2023-10-26",

  // Crucial: This is the unstructured data turned structured by AI.
  // It's a flexible map of Key:Value pairs.
  "parsedData": {
    "Hemoglobin": "14.2 g/dL",
    "White Blood Cell Count": "5.5 K/uL",
    "Platelets": "250 K/uL",
    "Glucose, Random": "92 mg/dL",
    "Cholesterol, Total": "185 mg/dL"
  }
  // Note: For an MVP, simple string values are fine.
  // Later, we could make these objects {"value": 14.2, "unit": "g/dL"} for graphing.
}
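The note about moving from strings to {"value", "unit"} objects can be sketched as a small parser. This is one possible approach rather than the project's actual implementation: the function names are hypothetical, and it assumes values look like "number, space, unit". Anything that does not start with a number is kept as the raw string so no extracted data is lost.

```typescript
interface LabMeasurement {
  value: number;
  unit: string;
}

// Splits a string such as "14.2 g/dL" into a numeric value and a unit.
// Returns null when the string does not start with a number.
function parseMeasurement(raw: string): LabMeasurement | null {
  const match = /^\s*(-?\d+(?:\.\d+)?)\s*(.*?)\s*$/.exec(raw);
  if (match === null) {
    return null;
  }
  return { value: Number(match[1]), unit: match[2] ?? "" };
}

// Converts a whole parsedData map, falling back to the raw string
// for values that cannot be parsed (for example "Negative").
function structureParsedData(
  parsedData: Record<string, string>
): Record<string, LabMeasurement | string> {
  const out: Record<string, LabMeasurement | string> = {};
  for (const testName of Object.keys(parsedData)) {
    const raw = parsedData[testName];
    if (raw === undefined) {
      continue;
    }
    out[testName] = parseMeasurement(raw) ?? raw;
  }
  return out;
}
```

Running this inside the Cloud Function before the write would give the graphing code numeric values without changing the flexible map shape.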

3. The Appointments Collection (appointments)

This is a "join" document connecting the two actors for a specific event.

Why denormalize names? We store patientName and doctorName directly on the appointment document. This is a common Firestore practice called denormalization. It means when loading the Doctor's calendar view, we don't have to perform a separate read on the users collection just to get the patient's name for the calendar entry. It saves reads and speeds up the UI. The trade-off is that if a user changes their displayName, the copies on their existing appointments must be updated too.

// Collection: appointments
// Document ID: <Auto-generated ID>
{
  // Links to the users
  "doctorId": "<UID_of_Dr_Smith>",
  "patientId": "<UID_of_Sarah_Jones>",

  // Denormalized data (copied for fast read access without joining)
  "patientName": "Sarah Jones",
  "doctorName": "Dr. Smith",

  // Scheduling details
  "scheduledStart": "2023-11-01T09:00:00Z", // Timestamp
  "scheduledEnd": "2023-11-01T09:30:00Z", // Timestamp
  "status": "confirmed", // ENUM: "pending", "confirmed", "cancelled"
  "notes": "Follow-up discussion on Oct 28 blood work results."
}
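Here is a minimal sketch of how an appointment document could be assembled with the names copied in at creation time. The helper names are hypothetical, and starting every appointment in the "pending" status is an assumption, not something this schema prescribes. The second helper shows the cost side of denormalization: a name change has to be propagated to existing appointments.

```typescript
interface UserSummary {
  userId: string;
  displayName: string;
}

type AppointmentStatus = "pending" | "confirmed" | "cancelled";

interface AppointmentDoc {
  doctorId: string;
  patientId: string;
  patientName: string; // denormalized copy of the patient's displayName
  doctorName: string;  // denormalized copy of the doctor's displayName
  scheduledStart: string;
  scheduledEnd: string;
  status: AppointmentStatus;
  notes: string;
}

function buildAppointment(
  doctor: UserSummary,
  patient: UserSummary,
  scheduledStart: string,
  scheduledEnd: string,
  notes: string = ""
): AppointmentDoc {
  // Also rejects unparseable dates, because comparisons with NaN are false.
  if (!(Date.parse(scheduledEnd) > Date.parse(scheduledStart))) {
    throw new Error("scheduledEnd must be after scheduledStart");
  }
  return {
    doctorId: doctor.userId,
    patientId: patient.userId,
    patientName: patient.displayName,
    doctorName: doctor.displayName,
    scheduledStart,
    scheduledEnd,
    status: "pending", // assumption: new appointments start as pending
    notes,
  };
}

// Returns updated copies for every appointment of the renamed patient.
function renamePatient(
  appointments: AppointmentDoc[],
  patientId: string,
  newName: string
): AppointmentDoc[] {
  return appointments.map((a) =>
    a.patientId === patientId ? { ...a, patientName: newName } : a
  );
}
```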

Visual Architecture

Security Rules Preview (Crucial for Healthcare)

While we won't write the exact rules today, our database design must support them. Here is how this design secures the data:

Users: A user can read and write only their own document (request.auth.uid == userId). To support the "all patients assigned to me" query, a doctor also needs read access to patient documents whose assignedDoctorId equals the doctor's UID.
Lab Reports: A patient can read only the reports inside their own sub-collection. A doctor can read reports if they are the assignedDoctorId of the parent user document.
Appointments: A user can read an appointment only if their UID matches either the doctorId or the patientId field of that document.
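The three access checks above can be modeled as plain predicates before any real rules are written. This is not the Firestore Security Rules language. It is a TypeScript sketch with hypothetical function names, where AuthContext stands in for request.auth and null means an unauthenticated request.

```typescript
// Stand-in for request.auth in security rules.
interface AuthContext {
  uid: string;
}

// Users: only the owner may read or write their own document.
function canAccessUserDoc(auth: AuthContext | null, userId: string): boolean {
  return auth !== null && auth.uid === userId;
}

// Lab reports: readable by the patient who owns the sub-collection,
// or by the doctor named in the parent document's assignedDoctorId.
function canReadLabReport(
  auth: AuthContext | null,
  patient: { userId: string; assignedDoctorId: string }
): boolean {
  if (auth === null) {
    return false;
  }
  return auth.uid === patient.userId || auth.uid === patient.assignedDoctorId;
}

// Appointments: readable by either participant.
function canReadAppointment(
  auth: AuthContext | null,
  appointment: { doctorId: string; patientId: string }
): boolean {
  if (auth === null) {
    return false;
  }
  return auth.uid === appointment.doctorId || auth.uid === appointment.patientId;
}
```

Each predicate needs only fields that already exist in the schema, which is the property the design is meant to guarantee.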

Summary of Day 2

We have successfully mapped our application requirements to a scalable, secure Firestore schema. We know exactly what data goes where and how the AI's output will be stored.

Next Steps: Preparing the Cloud Function triggers for file ingestion.