HLD: Mini Google Drive (File Storage System)
L4 scoping note. This is NOT "design GFS." This is "design a file storage service for ~100M users at 10 GB avg." Emphasize the split between metadata (structured, small, queryable) and blob storage (unstructured, huge, content-addressed). Chunking and resumable uploads are the meaty deep dives. Deduplication and sharing are great follow-ups. Do NOT design a consensus protocol for metadata -- use boring, reliable tech.
Understanding the Problem
What is Google Drive?
A cloud file storage service. Users upload files of arbitrary size, organize them in folders, share with others, and access from multiple devices. Think Dropbox, Google Drive, OneDrive. The interview value: it's a meaty full-stack system with interesting trade-offs in chunking, dedup, metadata/blob split, permissions, and large-file handling.
Functional Requirements
Core (above the line):
- Upload -- upload files (small and large, up to several GB), supporting resumable uploads.
- Download -- retrieve files by path or ID.
- List / navigate -- list files and folders, navigate directory tree.
- Delete -- remove files (soft delete / trash with eventual purge).
- Share -- share a file with another user (view or edit) or generate a public link.
Below the line (out of scope):
- Real-time collaborative editing (that's Docs/Sheets, a different product)
- Offline sync with conflict resolution on multiple devices (mention if asked)
- Server-side file preview / thumbnail generation (separate service)
- Full-text search across file contents (separate pipeline)
- Versioning beyond a simple "keep last N versions"
- Zero-knowledge end-to-end encryption (possible but out of scope for L4)
Non-Functional Requirements
Core:
- Durability -- 11 nines ("eleven 9s" = 99.999999999%). Data loss is catastrophic.
- Availability -- 99.9% for reads/writes. Downtime is annoying but not fatal.
- Scale -- 100M users * 10 GB avg = 1 EB total. Upload throughput: 10K uploads/sec peak. Download throughput: 100K/sec (read-heavy).
- Large file support -- individual files up to 5 GB. Resumable uploads across flaky networks.
Below the line:
- Global sub-100ms file metadata access from every continent (mention CDN for blobs, skip deep multi-region metadata design)
- Sub-second list consistency across devices (eventual is fine within seconds)
L4 sanity check: 1 EB is a lot of storage, but it's a numbers game -- S3 / GCS store this much today. 10K uploads/sec and 100K downloads/sec spread across thousands of machines is routine. The architecture is conceptually simple; the rigor is in the details of each layer.
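A quick back-of-envelope check of those numbers (the average upload size used for the chunk-rate estimate is an assumption for illustration):

```python
# Sanity-check the stated scale. Figures marked ASSUMPTION are illustrative only.
users = 100_000_000                 # 100M users (requirement)
avg_bytes_per_user = 10 * 10**9     # 10 GB average (requirement)
print(users * avg_bytes_per_user / 10**18)   # 1.0 -> ~1 EB of blob storage

uploads_per_sec = 10_000            # peak uploads/sec (requirement)
chunk_size = 4 * 2**20              # 4 MB chunks
avg_upload_bytes = 50 * 2**20       # ASSUMPTION: ~50 MB average upload
print(uploads_per_sec * avg_upload_bytes // chunk_size)  # ~125,000 chunk PUTs/sec at peak
```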
The Set Up
Core Entities
| Entity | Description |
|---|---|
| User | userId, email, quotaBytes, usedBytes |
| File | fileId, ownerId, parentFolderId, name, size, contentHash, createdAt, updatedAt, trashed |
| Folder | folderId, ownerId, parentFolderId, name |
| Chunk | chunkId = SHA256(contents), size, blobLocation |
| FileChunk | Join table: fileId, chunkIndex, chunkId -- the ordered list of chunks making up a file |
| Permission | fileId, granteeUserId (or publicToken), role (viewer/editor), expiresAt |
The API
Initiate a resumable upload:
POST /api/files/upload/init
Authorization: Bearer <token>
Content-Type: application/json
{
"name": "report.pdf",
"parentFolderId": "f_123",
"size": 48234782,
"contentHash": "sha256:abcd1234...", // optional, enables dedup
"mimeType": "application/pdf"
}
Response: 200 OK
{
"uploadId": "u_xyz987",
"uploadUrl": "https://uploads.example.com/u/xyz987",
"chunkSize": 4194304, // 4 MB
"expiresAt": "2026-04-20T10:00:00Z"
}
POST because initiating creates state on the server.
Upload a chunk:
PUT /uploads/{uploadId}/chunks/{chunkIndex}
Content-Type: application/octet-stream
Content-Range: bytes 0-4194303/48234782
<binary chunk data>
Response: 200 OK
{
"chunkIndex": 0,
"received": true,
"nextExpected": 1
}
Complete the upload:
POST /api/files/upload/{uploadId}/complete
Response: 201 Created
{
"fileId": "f_new123",
"name": "report.pdf",
"size": 48234782,
"contentHash": "sha256:abcd1234..."
}
Download a file:
GET /api/files/{fileId}/download
Authorization: Bearer <token>
Response: 302 Found
Location: https://<signed-blob-url>?token=...&expires=...
The response is a signed URL to the blob store (S3 presigned URL or GCS signed URL). The browser fetches the content directly -- bypassing our app servers entirely for the download bytes.
List folder contents:
GET /api/folders/{folderId}/contents?cursor=<opaque>&limit=100
Response: 200 OK
{
"folder": { "folderId": "f_123", "name": "Projects" },
"entries": [
{ "type": "file", "fileId": "f_abc", "name": "spec.doc", "size": 12034, "updatedAt": "..." },
{ "type": "folder", "folderId": "f_xyz", "name": "Archive" },
...
],
"nextCursor": "...",
"hasMore": true
}
Share a file:
POST /api/files/{fileId}/permissions
{
"granteeEmail": "bob@example.com",
"role": "viewer"
}
Response: 200 OK
{ "permissionId": "p_123" }
Delete (soft):
DELETE /api/files/{fileId}
Moves to trash. Permanent delete after 30 days via a background job.
High-Level Design
[Client] -> [CDN] -> [API gateway] -> [Metadata service] -> [PostgreSQL (metadata)]
                         |
                         +----------> [Auth service]
                         |
                         +----------> [Upload coordinator] -> [Blob storage (S3/GCS)]
                                                                       ^
                                                                       |
                                                         [Garbage collector] (periodic)
Flow 1: Upload a file (large, resumable)
1. Client calls POST /upload/init with file metadata.
2. Metadata service creates an UploadSession row (uploadId, draft fileId, total size, received-chunks map) in Postgres.
3. Returns an uploadUrl pointing to the Upload Coordinator.
4. Client splits the file into 4 MB chunks. For each chunk (client sketch below):
   a. Compute SHA-256 of the chunk contents.
   b. PUT /uploads/{uploadId}/chunks/{i}.
   c. Upload Coordinator:
      - Writes the chunk to blob storage at key chunks/<sha256>.
      - Updates UploadSession.receivedChunks[i] = sha256.
      - Returns 200.
5. On network failure, the client resumes from the last confirmed chunk -- state is preserved on the server.
6. Once all chunks are received, the client calls POST /upload/{uploadId}/complete.
7. Upload Coordinator:
   a. Validates all chunks received.
   b. Creates a File row in Postgres with finalized metadata.
   c. Inserts FileChunk rows linking the file to its ordered chunks.
   d. Marks the UploadSession complete (or deletes it).
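A minimal client-side sketch of this flow, assuming the endpoints and response fields shown in the API section. The /status response shape and the X-Chunk-SHA256 header are assumptions for illustration, not part of the spec above.

```python
# Client-side resumable chunked upload (Flow 1), sketched with the `requests` library.
import hashlib
import os
import requests

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, matching the chunkSize returned by /upload/init

def upload_file(base_url: str, token: str, path: str, parent_folder_id: str) -> str:
    headers = {"Authorization": f"Bearer {token}"}
    size = os.path.getsize(path)

    # 1. Initiate the upload session.
    init = requests.post(
        f"{base_url}/api/files/upload/init",
        headers=headers,
        json={"name": os.path.basename(path), "parentFolderId": parent_folder_id, "size": size},
    ).json()
    upload_url = init["uploadUrl"]

    # 2. Ask where to (re)start -- "nextExpected" is an assumed field name.
    status = requests.get(f"{upload_url}/status", headers=headers).json()
    index = status.get("nextExpected", 0)

    # 3. Upload chunks with Content-Range; retries/resumes simply continue from `index`.
    with open(path, "rb") as f:
        f.seek(index * CHUNK_SIZE)
        while chunk := f.read(CHUNK_SIZE):
            start = index * CHUNK_SIZE
            end = start + len(chunk) - 1
            requests.put(
                f"{upload_url}/chunks/{index}",
                headers={**headers,
                         "Content-Type": "application/octet-stream",
                         "Content-Range": f"bytes {start}-{end}/{size}",
                         # Hypothetical header: how the chunk hash is transported isn't specified above.
                         "X-Chunk-SHA256": hashlib.sha256(chunk).hexdigest()},
                data=chunk,
            ).raise_for_status()
            index += 1

    # 4. Finalize: server validates all chunks and creates the File row.
    done = requests.post(f"{upload_url}/complete", headers=headers).json()
    return done["fileId"]
```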
Flow 2: Download a file
1. Client calls GET /files/{fileId}/download.
2. Metadata service checks permissions (caller is the owner or has a valid Permission row).
3. Metadata service looks up the chunk list for the file.
4. Two paths:
   - Simple (small files): stream the file as a single response, concatenating chunks from blob storage.
   - Better (large files): where possible -- single-chunk files, or clients that can handle it -- return a 302 redirect to a signed blob URL. The browser downloads directly from blob storage, saving app-server bandwidth.
5. For multi-chunk files behind signed URLs, either:
   - Reassemble chunks into a single blob at upload completion (copy-on-complete), or
   - Return a list of signed URLs for the chunks and let the client concatenate, or
   - Use a tiny "chunk concatenation proxy" that streams from the blob store to the client.
Most implementations store files as single blobs post-upload-complete (copy-on-complete) for download simplicity. Trade-off discussed below.
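For the redirect path, a sketch of generating the signed URL with boto3, assuming copy-on-complete (one blob per file) and an illustrative bucket/key layout that is not part of the design above:

```python
# Build a short-lived presigned S3 URL for the 302-redirect download path.
import boto3

s3 = boto3.client("s3")

def download_redirect_url(file_row: dict, ttl_seconds: int = 300) -> str:
    """Return a presigned URL the API server can 302-redirect the client to."""
    return s3.generate_presigned_url(
        "get_object",
        Params={
            "Bucket": "drive-blobs",                 # assumed bucket name
            "Key": f"files/{file_row['file_id']}",   # assumed copy-on-complete key layout
            "ResponseContentDisposition": f"attachment; filename=\"{file_row['name']}\"",
        },
        ExpiresIn=ttl_seconds,
    )
```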
Flow 3: List a folder
1. Client calls GET /folders/{folderId}/contents.
2. Metadata service queries SELECT * FROM files WHERE parent_folder_id = ? AND NOT trashed, and similarly for folders.
3. Results are cursor-paginated by (updated_at, fileId).
4. Permissions are checked at the folder level.
Flow 4: Share
1. Caller POSTs a permission.
2. Metadata service inserts a Permission row.
3. The recipient's next list/access call sees the shared file via the permission check.
Flow 5: Delete + GC
1. Client DELETEs a file. File.trashed = true, File.trashedAt = now().
2. The file is not immediately removed from chunk storage -- files in trash are recoverable for 30 days.
3. A background Garbage Collector job:
   - Finds files with trashedAt < now - 30 days.
   - Deletes their FileChunk rows.
   - For each chunk, checks whether any other file still references it (SELECT COUNT(*) FROM file_chunks WHERE chunk_id = ?). If zero, deletes the chunk blob.
   - Deletes the File row.
Potential Deep Dives
1) Metadata service vs blob storage split
This is THE structural decision. Articulate it early.
Good Solution: Split
- Metadata in PostgreSQL: small, structured, queryable. Files, folders, permissions, upload sessions. Tens of KB per user -- orders of magnitude smaller than the blob bytes. A single Postgres cluster handles early scale; shard by user later (deep dive 5).
- Blob storage in S3 / GCS (or a Colossus-backed blob store if internal at Google): the bytes. Huge. Key-value access pattern only.
- Why: they have totally different access patterns. SQL is great at joins and filtering; it's terrible at storing EB-scale bytes. S3 is great at EB of bytes; it's terrible at "find all files Bob can edit."
Challenges
- Two systems to keep in sync. If the blob write succeeds but the metadata write fails, you have an orphan chunk. Periodic GC handles it.
- Transactional guarantees are weaker than single-DB. Accept it; this is the industry standard pattern.
2) Chunking: why and how big?
Bad Solution: Single-blob per file
- Approach: Upload files as a single blob. Done.
- Challenges: Resume after failure means re-uploading the whole file. A 2 GB file over flaky wifi is painful. Dedup impossible at sub-file granularity.
Good Solution: Fixed-size chunks (4 MB)
- Approach: Split files into 4 MB chunks. Each chunk is uploaded independently. Metadata tracks the ordered chunk list.
- Why 4 MB? Small enough that a failed chunk upload is cheap to retry. Large enough that per-chunk overhead (HTTP headers, auth, etc.) is a small fraction of the transfer. In the same ballpark as S3 / GCS multipart upload part sizes.
- Pros: Resumable by chunk. Parallel chunk upload possible. Dedup at chunk level.
Great Solution: Fixed 4 MB chunks + content-addressable storage
- Approach: Name each chunk by its content hash (SHA-256). On upload, if the chunk hash already exists in blob storage, skip the upload ("already have it"). This is dedup (sketched below).
- Savings: If 1000 users upload the same corporate template, we store it ONCE.
- Reality: Google- and Dropbox-class systems have reported 30-50% storage savings from dedup on typical corporate workloads.
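A small sketch of the fixed-size, content-addressed chunking described above. The chunks/<hash> key prefix follows the upload flow; the helper name is illustrative.

```python
# Split a file into fixed 4 MB chunks and derive a content-addressed blob key per chunk.
import hashlib
from typing import Iterator, Tuple

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB

def chunk_keys(path: str) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (chunk_index, blob_key, chunk_bytes) for each fixed-size chunk."""
    with open(path, "rb") as f:
        index = 0
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            yield index, f"chunks/{digest}", chunk
            index += 1

# Identical chunks in different files (or different users' files) map to the same
# blob key, which turns dedup into a "does this key already exist?" check.
```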
L4 note: Mention variable-size chunking (content-defined chunking via rolling hash like rsync / Rabin fingerprint) as an advanced option for deduplicating near-identical files. Don't go deeper.
3) Resumable upload protocol
The problem
A user's network drops mid-upload of a 2 GB file. We must NOT make them start over.
Good Solution: Google Resumable Upload-style protocol
- Init: client calls POST /upload/init, gets an uploadId and uploadUrl.
- Upload chunks: PUT {uploadUrl}/chunks/{index} with a Content-Range header (standard HTTP). The server records which chunks have been received.
- Resume: on reconnect, the client sends GET {uploadUrl}/status and the server responds with the highest contiguous chunk received. The client continues from there.
- Complete: POST {uploadUrl}/complete finalizes. The server validates that all chunks are present.
- Expiry: upload sessions expire after 24 hours -- GC unfinished sessions and their orphan chunks.
State storage
- UploadSession row in Postgres: uploadId, userId, fileName, size, receivedChunks (bitmap or list of indices), expiresAt.
- For large files with many chunks, use a bitmap for efficient storage (sketched below).
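A sketch of that receivedChunks bitmap: one bit per chunk index, storable in the BYTEA column. Helper names are illustrative.

```python
# Bitmap bookkeeping for received chunks (a 2 GB file at 4 MB chunks = 512 bits = 64 bytes).
def mark_received(bitmap: bytearray, chunk_index: int) -> None:
    byte, bit = divmod(chunk_index, 8)
    if byte >= len(bitmap):
        bitmap.extend(b"\x00" * (byte - len(bitmap) + 1))
    bitmap[byte] |= 1 << bit

def is_received(bitmap: bytes, chunk_index: int) -> bool:
    byte, bit = divmod(chunk_index, 8)
    return byte < len(bitmap) and bool(bitmap[byte] & (1 << bit))

def resume_index(bitmap: bytes, total_chunks: int) -> int:
    """Index of the first missing chunk -- where a resumed upload should continue."""
    for i in range(total_chunks):
        if not is_received(bitmap, i):
            return i
    return total_chunks
```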
Challenges
- Idempotency: a retry of chunk N must not overwrite existing chunk N with different content. Use the chunk hash as the blob key -- retries converge to the same blob.
- Partial failures at finalize: if complete fails after inserting the File row but before inserting all FileChunk rows, we are left in a half-written state. Use a single transaction (sketched below).
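A sketch of that single finalize transaction, assuming a psycopg2-style connection and the table names from the schema in deep dive 5 below; the function shape is illustrative.

```python
# Finalize an upload atomically: File row + FileChunk rows + session cleanup commit together,
# so a crash mid-way cannot leave a File row without its chunk list.
def complete_upload(conn, upload_id: str, file_row: dict, chunk_hashes: list[str]) -> None:
    with conn:  # psycopg2: commits on success, rolls back on exception
        with conn.cursor() as cur:
            cur.execute(
                """INSERT INTO files (file_id, owner_id, parent_folder_id, name, size, content_hash)
                   VALUES (%(file_id)s, %(owner_id)s, %(parent_folder_id)s,
                           %(name)s, %(size)s, %(content_hash)s)""",
                file_row,
            )
            for index, chunk_hash in enumerate(chunk_hashes):
                cur.execute(
                    "INSERT INTO file_chunks (file_id, chunk_index, chunk_id) VALUES (%s, %s, %s)",
                    (file_row["file_id"], index, chunk_hash),
                )
            cur.execute("DELETE FROM upload_sessions WHERE upload_id = %s", (upload_id,))
```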
4) Deduplication design
Good Solution: Chunk-level content-addressing
- Before uploading a chunk, the client computes its SHA-256 locally and asks HEAD /chunks/{hash}.
- If the blob exists, skip the upload. The client proceeds as if the chunk was uploaded successfully.
- Metadata records the reference.
Challenges
- Reference counting: when a file is deleted, we must not delete chunk blobs that other files still reference. Options:
- Reference-counted blobs: on each add/remove, update a counter. Race conditions possible without a transactional store.
- Mark-and-sweep GC: periodically scan metadata for all referenced chunks, delete blobs not referenced. Slower but simpler. Used by large systems.
- Security concern: content-addressing means anyone who happens to have the same file content gets a "free" upload. That's fine -- they already had the content. But cross-account dedup can leak presence info ("is this file on the system?"). Usually scope dedup within an account.
For L4, per-account dedup with mark-and-sweep GC is the safe answer.
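A sketch of mark-and-sweep chunk GC under those assumptions. Table names follow the schema in the next deep dive; the blob-store listing interface and the grace period are illustrative, not a specific API.

```python
# Mark every chunk_id still referenced in metadata, then sweep unreferenced blobs that are
# old enough to be outside any in-flight upload window.
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=1)  # don't sweep blobs newer than this (may belong to in-flight uploads)

def mark_referenced_chunks(cur) -> set[str]:
    cur.execute("SELECT DISTINCT chunk_id FROM file_chunks")
    return {row[0] for row in cur.fetchall()}

def sweep(blob_store, cur) -> int:
    referenced = mark_referenced_chunks(cur)
    now = datetime.now(timezone.utc)
    deleted = 0
    for key, last_modified in blob_store.list("chunks/"):   # assumed blob-store interface
        chunk_id = key.removeprefix("chunks/")
        if chunk_id not in referenced and now - last_modified > GRACE:
            blob_store.delete(key)                          # assumed blob-store interface
            deleted += 1
    return deleted
```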
5) Database schema and scaling
Schema sketch
CREATE TABLE users (
user_id UUID PRIMARY KEY,
email TEXT UNIQUE NOT NULL,
used_bytes BIGINT DEFAULT 0,
quota_bytes BIGINT DEFAULT 16106127360 -- 15 GB
);
CREATE TABLE files (
file_id UUID PRIMARY KEY,
owner_id UUID REFERENCES users,
parent_folder_id UUID,
name TEXT NOT NULL,
size BIGINT,
content_hash TEXT,
mime_type TEXT,
trashed BOOLEAN DEFAULT FALSE,
trashed_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX ON files (parent_folder_id) WHERE NOT trashed;
CREATE INDEX ON files (owner_id, updated_at);
CREATE TABLE file_chunks (
file_id UUID,
chunk_index INT,
chunk_id TEXT, -- SHA-256 hex
PRIMARY KEY (file_id, chunk_index)
);
CREATE TABLE permissions (
permission_id UUID PRIMARY KEY,
file_id UUID,
grantee_id UUID, -- nullable if public link
public_token TEXT, -- nullable
role TEXT, -- 'viewer' | 'editor'
expires_at TIMESTAMP
);
CREATE INDEX ON permissions (grantee_id);
CREATE INDEX ON permissions (file_id);
CREATE TABLE upload_sessions (
upload_id UUID PRIMARY KEY,
user_id UUID,
file_name TEXT,
size BIGINT,
received_chunks BYTEA, -- bitmap
expires_at TIMESTAMP
);
Scaling strategy
- Start: single Postgres cluster (primary + read replicas). Handles millions of users.
- Sharding: shard by user_id. Each shard owns all files/folders/permissions for a user-id range. Cross-user queries (shared-with-me) become a federation. Mention this as a later step, once the primary can't keep up (routing sketched below).
- Partition key choice: sharding by user works because the most common query patterns are scoped to a user. "Shared with me" is the tricky query -- one option is to materialize a separate "shared_with_me" table per user, written when a permission is granted.
6) Permissions model
Good Solution: Direct permission rows
- Each share = one row: (file_id, grantee_id, role).
- On access, check: owner? grantee? public link with valid token? (sketched below)
- Indexed on both file_id and grantee_id.
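A sketch of that access check against the schema in deep dive 5; the function shape and parameters are illustrative.

```python
# Access check for the direct-permission model: owner, explicit grantee, or valid public token.
def can_access(cur, file_id: str, user_id: str | None, public_token: str | None, need_edit: bool) -> bool:
    cur.execute("SELECT owner_id FROM files WHERE file_id = %s AND NOT trashed", (file_id,))
    row = cur.fetchone()
    if row is None:
        return False
    if user_id is not None and row[0] == user_id:
        return True  # owners can always read and write

    roles = ("editor",) if need_edit else ("viewer", "editor")
    cur.execute(
        """SELECT 1 FROM permissions
           WHERE file_id = %s
             AND role IN %s
             AND (expires_at IS NULL OR expires_at > NOW())
             AND (grantee_id = %s OR public_token = %s)
           LIMIT 1""",
        (file_id, roles, user_id, public_token),
    )
    return cur.fetchone() is not None
```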
Great Solution: Inheritance + explicit override
- Folders have permissions that children inherit. Editing a folder's permission updates all children in effect.
- Permission rows are stored on the nearest ancestor; the access check walks up the folder tree (bounded by tree depth, usually small).
- Explicit overrides at the file level beat inherited permissions.
For L4, direct-per-file is fine and widely used. Mention inheritance as an extension.
Public sharing
- A Permission row with a public_token (random string) and no grantee_id.
- URL format: https://drive.example.com/p/{public_token}.
- Revoke by deleting the row.
7) Consistency concerns
- Blob before metadata: always write chunks to blob storage first, THEN write metadata. If metadata write fails, the orphan chunk is cleaned up by GC (content-addressed, might even be reused by another file).
- Listings: eventually consistent with uploads. A file just uploaded might not appear in listings for a few seconds if we have read replicas. Usually acceptable.
- Strong consistency for permission revocation: when a share is revoked, the check must reflect it immediately. Read from primary, not replica, for permission checks.
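Putting the blob-before-metadata rule above into code, a sketch of the per-chunk handler on the Upload Coordinator; the blob_store and sessions interfaces are illustrative assumptions.

```python
# Write the bytes first, then record receipt. If the metadata update fails, the orphaned,
# content-addressed blob is reclaimed later by GC (or reused by another upload).
def handle_chunk_put(blob_store, sessions, upload_id: str, index: int, chunk: bytes, sha256_hex: str) -> dict:
    key = f"chunks/{sha256_hex}"
    if not blob_store.exists(key):        # content-addressed: identical retries converge
        blob_store.put(key, chunk)        # 1. durable bytes first
    sessions.mark_received(upload_id, index, sha256_hex)  # 2. then metadata
    return {"chunkIndex": index, "received": True, "nextExpected": index + 1}
```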
8) Bandwidth and CDN
- Downloads are huge volume. App servers should NEVER stream bytes if avoidable.
- Use signed URLs to blob storage so the browser downloads directly.
- For public files, CDN in front of the blob store (CloudFront / Cloud CDN). Cache-Control aligned with file mutability.
9) What NOT to design at L4
- Don't design your own blob store (GFS / Colossus). Use GCS / S3 as a given.
- Don't design geo-replication for EB of data -- blob stores already do this.
- Don't design a real-time sync protocol -- that's Google Drive desktop client, separate concern.
- Don't do end-to-end encryption unless explicitly asked; it complicates sharing, search, and dedup significantly.
What is Expected at Each Level
L3 / Mid-level
- Separate metadata and blob storage.
- Basic upload/download/list APIs.
- Might propose simple single-blob upload without resumable protocol.
- Permissions as a simple table.
L4
- Chunk-based storage with resumable upload.
- Content-addressed chunks for deduplication.
- Signed URLs for download (app servers don't stream bytes).
- Schema with appropriate indexes, back-of-envelope on row counts.
- Soft delete + GC for chunks.
- Discussion of the consistency model (orphan chunks, metadata-vs-blob ordering).
- Permission checks with indexed lookups in both directions (by file and by grantee).
L5 / Senior
- Variable-size / content-defined chunking for cross-file dedup.
- Multi-region strategy: regional metadata primary with async replication, blob store already multi-region.
- Reference counting vs mark-sweep trade-offs in GC.
- Schema for large-scale sharing: materialized "shared with me" views, handling the fan-out query.
- Operational concerns: EB-scale cost accounting, storage-class tiering (hot/cold/archive), quota enforcement race conditions.
- Client sync protocol (events, tokens for resumable sync).
- E2E encryption trade-offs (no server-side preview, no dedup, harder sharing).