
15 — Problem: In-Memory File System

Understanding the Problem

An in-memory file system models the tree of files and directories you'd find on disk, but the whole thing lives in process memory. No inodes, no block devices, no page cache — just a tree of nodes rooted at /, with operations to list a directory, create a directory, append content to a file, and read a file. LeetCode 588 and 1166 are the canonical versions. In an interview setting this problem is less about the algorithm and more about whether you can name the right pattern, draw a clean abstraction boundary, and hold the conversation when the interviewer pushes toward symlinks, permissions, or concurrent writers.

What is an In-Memory File System?

A tree where every node is either a file (a leaf holding content) or a directory (an internal node holding named children). Clients interact with it through string paths — /a/b/c — which the file system parses into a list of components and walks. This is the textbook Composite pattern: File and Directory share an abstract Node interface so traversal, listing, and future operations (delete, move, stat) can treat them uniformly. Interviewers use this problem to probe tree modeling, path parsing edge cases, abstraction boundaries (where does the parser end and the tree begin?), and extensibility seams for symlinks, permissions, and persistence. Get the Node hierarchy right and the rest of the design falls out.


Clarifying Questions

You: What's the path separator, and is there a root?

Interviewer: Unix-style. / is the separator; / is the root. All paths start with /.

You: Absolute paths only, or do we support relative paths and a current working directory?

Interviewer: Absolute only. No cwd, no ., no ...

You: Case-sensitive?

Interviewer: Yes. /A and /a are different.

You: Are there size limits on files? Max depth of the tree?

Interviewer: No hard limits for the core design. Assume files fit in memory. Mention how you'd bound things in production.

You: Permissions — owners, modes, ACLs?

Interviewer: Out of scope for the core API. Discuss as a follow-up.

You: Symlinks? Hard links?

Interviewer: Out of scope. Definitely discuss as a follow-up — it's a good extensibility question.

You: Move, copy, delete?

Interviewer: Not in the core API. Implement ls, mkdir, addContentToFile, readContentFromFile. Talk about rm and mv as follow-ups.

You: What are the exact semantics of ls on a file vs a directory?

Interviewer: On a directory, return a sorted list of children names. On a file, return a single-element list containing just the file name. This matches LeetCode 588.

You: What does addContentToFile do if the file already exists? If the parent directory doesn't exist?

Interviewer: Append if the file exists. Create the file (and any missing parent directories) if it doesn't. Think mkdir -p semantics for the parents.

You: Concurrency — single caller or multi-threaded?

Interviewer: Single-threaded for the core. Discuss concurrency as a follow-up.

You: Atomic writes — if addContentToFile is mid-write and another reader comes in, what do they see?

Interviewer: Single-threaded, so not an issue for the core. But yes, discuss when we get to concurrency.


Functional Requirements

  1. ls(path: string): string[] — List the contents of a path.
    • If path is a directory, return children names sorted lexicographically.
    • If path is a file, return a single-element array with just the file's name (not its content).
    • If path doesn't exist, throw / error. (LC 588 guarantees the path exists, but a production API should fail loudly.)
  2. mkdir(path: string): void — Create a directory, and any missing parent directories (mkdir -p).
    • Idempotent: creating a directory that already exists is a no-op.
    • Error if any component of the path is an existing file.
  3. addContentToFile(path: string, content: string): void — Write to a file.
    • Create the file if it doesn't exist, creating parent directories as needed.
    • Append content to the file if it already exists.
    • Error if path resolves to a directory.
  4. readContentFromFile(path: string): string — Return the full content of a file.
    • Error if path doesn't exist or resolves to a directory.

Out of scope: rm, mv, cp, permissions, symlinks, persistence, concurrent access (all discussed as follow-ups).


Non-Functional Requirements

| Dimension | Target | Reasoning |
|---|---|---|
| Read latency | O(D) for traversal + O(K log K) for ls sort, where D = depth, K = children. Content reads are O(N) in file size. | Tree walk is unavoidable; sort on ls is a spec requirement. |
| Write latency | O(D) for path traversal + O(C) append for content, where C = appended length. | mkdir -p traversal cost is dominated by tree depth, not breadth. |
| Memory footprint | O(total path chars + total content bytes). Per-node overhead dominated by children Map. | A large flat directory with 10k files holds 10k Map entries in its parent. |
| Concurrency | Core: single-threaded. Extension: per-file RWLock + lock-ordering for multi-file operations. | Covered in Concurrency Considerations below. |
| Extensibility | Must support symlinks, ACLs, quotas, and a disk-backed store with localized changes — ideally at the Node level. | The Composite hierarchy is the extension surface. |
| Error behavior | Fail loudly on bad paths, type mismatches (file-under-file), and read-from-directory. | Silent failure in a file system is catastrophic. |

Core Entities and Relationships

| Entity | Role | Key State |
|---|---|---|
| INode (abstract) | Base type for anything in the tree. Enables Composite uniformity. | name, type ("file" \| "dir"), size |
| File (concrete INode) | Leaf node. Holds content. | content: string |
| Directory (concrete INode) | Internal node. Holds named children. | children: Map<string, INode> |
| Path | Parsed path — a sequence of components, plus absolute/root metadata. | components: string[] |
| FileSystem | Facade. Owns the root directory, parses paths, routes operations. | root: Directory |

Why Composite? ls doesn't care whether the thing at a path is a file or a directory — it just needs name, and polymorphic listContents(). mkdir -p walks the tree descending through Directory nodes and stopping if it hits a File. Treating File and Directory as sibling subtypes of a common INode lets every traversal be a single polymorphic walk. Without Composite you'd smear if (isFile) ... else ... checks across every operation — exactly the duplication the pattern exists to prevent.
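The polymorphic-walk payoff can be shown in a minimal sketch. The names mirror the design above (INode, File, Directory), but this is a stripped-down illustration, not the full implementation; totalBytes stands in for any operation that treats files and directories uniformly.

```typescript
// Minimal Composite sketch: File and Directory share INode, so one
// recursive walk handles both. Names mirror the design above; the
// totalBytes operation is illustrative, not part of the core API.
type NodeType = "file" | "dir";

interface INode {
  readonly name: string;
  readonly type: NodeType;
}

class File implements INode {
  readonly type = "file" as const;
  constructor(public readonly name: string, public content = "") {}
}

class Directory implements INode {
  readonly type = "dir" as const;
  readonly children = new Map<string, INode>();
  constructor(public readonly name: string) {}
  add(node: INode): this {
    this.children.set(node.name, node);
    return this;
  }
}

// One polymorphic walk, no per-operation isFile checks scattered around.
function totalBytes(node: INode): number {
  if (node.type === "file") return (node as File).content.length;
  let sum = 0;
  for (const child of (node as Directory).children.values()) {
    sum += totalBytes(child);
  }
  return sum;
}

const root = new Directory("")
  .add(new File("a.txt", "hello"))
  .add(new Directory("docs").add(new File("b.txt", "world!")));
console.log(totalBytes(root)); // 5 + 6 = 11
```

The single type check at the top of totalBytes is the only branch; everything else is the same recursion whether the child is a leaf or a subtree.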

Why keep Path separate? Path parsing is pure string manipulation. Tree traversal is pointer chasing. If you mix them, every operation has to re-parse on every call and the traversal code fills with string-slicing noise. Separating them means FileSystem.ls() reads as "parse, walk, list" — three lines instead of fifteen.


Interfaces

typescript
type NodeType = "file" | "dir";

interface INode {
  readonly name: string;
  readonly type: NodeType;
  readonly size: number; // bytes for files; number of children for dirs
}

interface IPath {
  readonly components: string[]; // [] means root "/"
  readonly raw: string;
  isRoot(): boolean;
  parent(): IPath;
  last(): string;
  join(component: string): IPath;
}

interface IFileSystem {
  ls(path: string): string[];
  mkdir(path: string): void;
  addContentToFile(path: string, content: string): void;
  readContentFromFile(path: string): string;
}

The INode interface is intentionally narrow — just enough for polymorphic reads. Concrete classes expose mutation operations (addContent, addChild) but clients only ever touch the IFileSystem facade, never INode directly. That boundary is what lets us add symlinks later without breaking callers.


Class Diagram

                      ┌──────────────────────┐
                      │     <<interface>>    │
                      │        INode         │
                      │  + name: string      │
                      │  + type: NodeType    │
                      │  + size: number      │
                      └──────────┬───────────┘

                 ┌───────────────┴───────────────┐
                 │                               │
        ┌────────▼─────────┐           ┌─────────▼──────────┐
        │      File        │           │     Directory      │
        │──────────────────│           │────────────────────│
        │ - content: str   │           │ - children:        │
        │ + append(s)      │           │    Map<str, INode> │
        │ + read(): str    │           │ + addChild(n)      │
        └──────────────────┘           │ + getChild(n)      │
                                       │ + listNames()      │
                                       └────────┬───────────┘
                                                │ 0..*
                                                │ (holds INode)

                                       back-edge to INode

              ┌──────────────────────────────────┐
              │          FileSystem              │
              │──────────────────────────────────│
              │ - root: Directory                │
              │──────────────────────────────────│
              │ + ls(path): string[]             │
              │ + mkdir(path): void              │
              │ + addContentToFile(path, c)      │
              │ + readContentFromFile(path): str │
              │ - parse(path): Path              │
              │ - traverse(path, createDirs): N  │
              └──────────────────────────────────┘

                   │ owns

              Directory (root)

                   │ children (Composite)

              File │ Directory │ ...

The core insight: the recursion structure lives in Directory.children holding INode — a directory's children are nodes, and a node might itself be a directory. That's the whole Composite pattern.


Class Design

File

State:

  • name: string — just the basename, e.g. "report.txt".
  • content: string — the raw content. Starts empty.

Methods:

  • append(text: string): void — append to content. O(C) in appended length.
  • read(): string — return content. O(1) reference return (string is reference-shared in JS).
  • size: number (getter) — content.length.

The File class doesn't know its own path. Path context is the FileSystem's job. If we stored parent pointers we'd need to update them on every move, and we'd break the simple uniform tree invariant.

Directory

State:

  • name: string
  • children: Map<string, INode> — name → child node. Map because we need O(1) lookup during traversal; children names must be unique.

Methods:

  • addChild(node: INode): void — throws if a child with that name already exists with a conflicting type.
  • getChild(name: string): INode | undefined — O(1) lookup.
  • hasChild(name: string): boolean
  • listNames(): string[] — sorted children names, for ls.
  • size: number (getter) — children.size.

Why Map instead of an Array? Traversal does a lookup on every path component. With 10k children in a directory, Array.find is O(N) per step — a deep path in a wide tree turns into a quadratic disaster. Map gives O(1). We pay ordering in exchange, but we only need ordering at ls time, which is a one-shot sort.

Path

State:

  • components: string[] — ["a", "b", "c"] for /a/b/c. Empty for /.
  • raw: string — original input, useful for error messages.

Methods:

  • isRoot(): boolean — components.length === 0.
  • parent(): Path — drop the last component.
  • last(): string — the final component (basename).
  • join(component: string): Path — append a component.

Parsing is a static factory: Path.parse("/a/b/c"). It handles trailing slashes, collapses double slashes, and rejects empty / non-absolute input.

FileSystem

State:

  • root: Directory — singleton root named "".

Methods:

  • ls, mkdir, addContentToFile, readContentFromFile — the public API.
  • traverse(path: Path, createMissingDirs: boolean): INode — private helper. This is where the Composite tree walk lives. createMissingDirs toggles mkdir -p semantics.

The traverse helper is the single point of path-walking logic. Every public method ends up calling it, differing only in the createMissingDirs flag and what they do with the result. That's how you avoid bugs: one walk, many callers.


Key Methods

typescript
// ---- Path parsing ----

class Path implements IPath {
  constructor(
    public readonly components: string[],
    public readonly raw: string,
  ) {}

  static parse(raw: string): Path {
    if (typeof raw !== "string" || raw.length === 0) {
      throw new Error(`invalid path: ${JSON.stringify(raw)}`);
    }
    if (raw[0] !== "/") {
      throw new Error(`path must be absolute: ${raw}`);
    }
    // Split and filter empty segments — this collapses "//" and trailing "/".
    const components = raw.split("/").filter((c) => c.length > 0);
    for (const c of components) {
      if (c === "." || c === "..") {
        throw new Error(`path component ${c} not supported`);
      }
    }
    return new Path(components, raw);
  }

  isRoot(): boolean {
    return this.components.length === 0;
  }

  parent(): Path {
    if (this.isRoot()) throw new Error("root has no parent");
    return new Path(this.components.slice(0, -1), this.raw);
  }

  last(): string {
    if (this.isRoot()) throw new Error("root has no basename");
    return this.components[this.components.length - 1];
  }

  join(component: string): Path {
    return new Path([...this.components, component], `${this.raw}/${component}`);
  }
}

// ---- Nodes ----

class File implements INode {
  readonly type = "file" as const;
  private content: string = "";

  constructor(public readonly name: string) {}

  get size(): number {
    return this.content.length;
  }

  append(text: string): void {
    this.content += text;
  }

  read(): string {
    return this.content;
  }
}

class Directory implements INode {
  readonly type = "dir" as const;
  private readonly children: Map<string, INode> = new Map();

  constructor(public readonly name: string) {}

  get size(): number {
    return this.children.size;
  }

  addChild(node: INode): void {
    const existing = this.children.get(node.name);
    if (existing && existing.type !== node.type) {
      throw new Error(
        `cannot add ${node.type} "${node.name}": a ${existing.type} with that name exists`,
      );
    }
    if (!existing) this.children.set(node.name, node);
  }

  getChild(name: string): INode | undefined {
    return this.children.get(name);
  }

  hasChild(name: string): boolean {
    return this.children.has(name);
  }

  listNames(): string[] {
    return [...this.children.keys()].sort();
  }
}

// ---- FileSystem facade ----

class FileSystem implements IFileSystem {
  private readonly root: Directory = new Directory("");

  ls(path: string): string[] {
    const p = Path.parse(path);
    const node = this.traverse(p, /* createMissingDirs */ false);
    if (node.type === "file") {
      // LC 588 quirk: ls on a file returns just the file's name.
      return [node.name];
    }
    return (node as Directory).listNames();
  }

  mkdir(path: string): void {
    const p = Path.parse(path);
    if (p.isRoot()) return; // root always exists
    this.traverse(p, /* createMissingDirs */ true);
  }

  addContentToFile(path: string, content: string): void {
    const p = Path.parse(path);
    if (p.isRoot()) throw new Error("cannot write to root");

    // Walk to the parent directory, creating any missing dirs along the way.
    const parent = this.traverse(p.parent(), /* createMissingDirs */ true);
    if (parent.type !== "dir") {
      throw new Error(`parent of ${path} is not a directory`);
    }
    const parentDir = parent as Directory;
    const basename = p.last();
    const existing = parentDir.getChild(basename);

    if (existing === undefined) {
      const file = new File(basename);
      file.append(content);
      parentDir.addChild(file);
      return;
    }
    if (existing.type !== "file") {
      throw new Error(`${path} is a directory, cannot add content`);
    }
    (existing as File).append(content);
  }

  readContentFromFile(path: string): string {
    const p = Path.parse(path);
    if (p.isRoot()) throw new Error("root is not a file");
    const node = this.traverse(p, /* createMissingDirs */ false);
    if (node.type !== "file") {
      throw new Error(`${path} is a directory`);
    }
    return (node as File).read();
  }

  // ---- Core traversal ----
  //
  // This is where the Composite tree walk happens. Every public method
  // routes through here. The `createMissingDirs` flag toggles mkdir -p.
  //
  // Iterative, not recursive — for deep trees we don't want to blow
  // the call stack. See "Tree traversal" under Design Decisions.
  private traverse(path: Path, createMissingDirs: boolean): INode {
    let current: INode = this.root;

    for (let i = 0; i < path.components.length; i++) {
      const component = path.components[i];

      if (current.type !== "dir") {
        // Can't descend through a file.
        throw new Error(
          `cannot traverse into file at segment "${component}" of ${path.raw}`,
        );
      }
      const dir = current as Directory;
      let next = dir.getChild(component);

      if (next === undefined) {
        if (!createMissingDirs) {
          throw new Error(`path not found: ${path.raw}`);
        }
        next = new Directory(component);
        dir.addChild(next);
      }
      current = next;
    }
    return current;
  }
}

Notice how the four public methods are each 4–10 lines. The real logic lives in Path.parse and traverse. That's not an accident — it's what the Composite + facade split buys you. If you later add rm, you'll write it in a couple of lines too, because the walk is already there.

Walkthrough — how a single call flows

Trace addContentToFile("/docs/april/report.txt", "hello") on an empty file system, step by step:

  1. Path.parse("/docs/april/report.txt") → Path { components: ["docs", "april", "report.txt"], raw: ... }.
  2. p.isRoot() is false, so we proceed.
  3. p.parent() → Path { components: ["docs", "april"], ... }.
  4. traverse(parent, createMissingDirs=true):
    • current = root (Directory "").
    • Step "docs": root.getChild("docs") → undefined. Because createMissingDirs, create new Directory("docs"), root.addChild(it), current = it.
    • Step "april": same — create and descend.
    • Returns the Directory("april") we just created.
  5. parent.type === "dir" check passes. parentDir.getChild("report.txt") → undefined.
  6. Create new File("report.txt"), file.append("hello"), parentDir.addChild(file).

A second call addContentToFile("/docs/april/report.txt", " world"):

  1. Parse and walk as before; this time every directory exists, no creation.
  2. parentDir.getChild("report.txt") → the existing File.
  3. existing.type === "file" → call existing.append(" world").

Final content: "hello world". No full-path lookup table needed — the tree structure is the index.


Design Decisions & Tradeoffs

File content storage

| Representation | Pros | Cons | When to use |
|---|---|---|---|
| string (mutable via reassignment) | Simple, idiomatic, reference-shared in JS. content += s is a one-liner. | Repeated append creates a new string each time; O(N) per append for a long file. V8 sometimes uses ropes internally, but you can't rely on it. | Demo, short files, interview solution. |
| Buffer / Uint8Array | Binary-safe, fixed-width, explicit capacity management. | Appending requires reallocating (or a second-level structure). Treats content as bytes, not text. | Production, binary files, large files. |
| Chunked / block list (string[] or linked list of blocks) | Append is O(1) amortized. Random access is O(chunks) not O(bytes). Plays nicely with persistence (each block flushes independently). | More code. Reads reassemble. Seek/slice isn't free. | Large files, future persistence backing. |

For the interview: state the string choice, then show you know the Buffer + chunked story if the file gets large.

Children collection

| Structure | Lookup | Insert | Ordered iteration | Memory |
|---|---|---|---|---|
| Map<string, INode> | O(1) | O(1) | Insertion order (sort on demand) | One entry per child |
| Array<INode> | O(N) | O(1) append, O(N) sorted-insert | Native if sorted on insert | Slightly lower overhead |
| TreeMap / sorted structure | O(log N) | O(log N) | O(N) in sorted order | Higher constants |

We chose Map because lookup during traversal is the hot-path operation, and lookups dominate in any non-toy workload. ls is rare relative to traversal, and a one-time sort at ls time is cheap (O(K log K), where K is the number of children in one directory).

Path representation

| Approach | Pros | Cons |
|---|---|---|
| Parsed list (string[]) | Traversal is a direct loop. Clean error messages per component. | Allocation cost for every call. |
| Raw string with running index | Zero allocation. | Parser state leaks into traversal code. Slash-counting bugs are common. |

Parsed list wins because path strings are small and the parser is a single point of validation. The allocation is noise next to the tree walk.

Tree traversal: recursive vs iterative

Recursive code for tree walks reads beautifully and matches the data structure. But a recursive mkdir -p on a path 10,000 components deep will blow the JavaScript call stack (V8's default limit is on the order of 10k frames). Real file systems cap paths at PATH_MAX (4096 bytes on Linux), which is shallow enough for recursion, but if someone tests you with a pathological path you want to survive.

Iterative wins the interview because it demonstrates you thought about stack depth. The code is barely longer — one for loop — and it's exception-safe (no half-built trees from a stack overflow mid-creation).

Abstraction boundary: why FileSystem owns path parsing

A tempting alternative is to give Directory a method resolve(path: Path): INode that recurses through children. It "feels object-oriented." It's wrong for this design:

  • Directory shouldn't know about paths. A Directory is a collection of named children. Paths are a client-side concern — a way of expressing "navigate through a sequence of names." Pushing path knowledge into Directory couples the tree implementation to the parser.
  • Traversal policy varies by caller. ls wants "fail if missing"; mkdir wants "create if missing." If Directory.resolve has to know both modes, it grows flags and callers lose clarity. Putting the walk in FileSystem keeps the policy where it belongs — next to the public API that chose it.
  • Future: permission checks, quota checks. These are cross-cutting and naturally live at the FileSystem layer, once per traversal step. Shoving them into Directory would sprinkle concerns across every node.

The rule of thumb: Directory knows about children, FileSystem knows about paths. That's the abstraction boundary, and every piece of logic lands on one side or the other.


Patterns Used

| Pattern | Where it shows up | Why |
|---|---|---|
| Composite | INode abstract, File and Directory concrete. Directory.children holds INode. | The whole reason polymorphic tree walks work. Adding Symlink later is a new concrete subtype, not a refactor. |
| Facade | FileSystem class wraps the tree and exposes the four-method API. | Clients don't touch nodes directly. The tree is free to change (B-tree per directory? persistent nodes? COW?) without breaking callers. |
| Iterator | Directory.listNames(), and any recursive walk (e.g. find, du). | Tree enumeration is the natural consumer of an iterator — generators in TS make this easy. |
| Visitor | Not implemented above, but hinted at. Operations like du -s, find -name, checksum are cleanest as visitors that walk the tree and accumulate. Keeps new operations from piling methods onto INode. | Extension point for cross-cutting operations (size totals, integrity checks, permission recalc). |
| Command | Not implemented above. Use for transactional multi-op work: mv is Copy + Delete, and if either fails the other must roll back. | Atomicity for composite operations. |
| Template Method (subtle) | traverse provides the walk skeleton; createMissingDirs flag is the variation point. | Avoids three near-identical traversal implementations. |
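The Visitor entry above can be sketched concretely. The IVisitor interface, the accept() walk, and the VNode shape below are all hypothetical extensions for illustration, not part of the core API; the example implements a find-style name search.

```typescript
// Illustrative Visitor sketch: operations like find accumulate over the
// tree without adding methods to INode. IVisitor, accept(), and VNode
// are hypothetical names, not part of the core design above.
interface IVisitor<R> {
  visitFile(name: string, content: string): R;
  visitDir(name: string, childResults: R[]): R;
}

type VNode =
  | { kind: "file"; name: string; content: string }
  | { kind: "dir"; name: string; children: VNode[] };

// Single generic walk; each new operation is just a new IVisitor.
function accept<R>(node: VNode, v: IVisitor<R>): R {
  if (node.kind === "file") return v.visitFile(node.name, node.content);
  return v.visitDir(node.name, node.children.map((c) => accept(c, v)));
}

// find -name "*.txt": collect matching file names bottom-up.
const findTxt: IVisitor<string[]> = {
  visitFile: (name) => (name.endsWith(".txt") ? [name] : []),
  visitDir: (_name, childResults) => childResults.flat(),
};

const tree: VNode = {
  kind: "dir",
  name: "",
  children: [
    { kind: "file", name: "a.txt", content: "hello" },
    { kind: "file", name: "notes.md", content: "# hi" },
    {
      kind: "dir",
      name: "docs",
      children: [{ kind: "file", name: "b.txt", content: "world" }],
    },
  ],
};
console.log(accept(tree, findTxt)); // ["a.txt", "b.txt"]
```

A du -s visitor would be the same shape with numbers instead of string arrays: sum child results in visitDir, return content.length in visitFile.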

Concurrency Considerations

The core design is single-threaded. A senior interviewer will push. Here's the menu.

The core risks

  1. Lost writes. Two threads call addContentToFile("/a", "X") and addContentToFile("/a", "Y"). Depending on interleaving you can end with "XY", "YX", "X", or "Y".
  2. Torn reads. A reader sees a file mid-append.
  3. Dangling traversal. A reader walks the tree; a writer deletes a directory underneath them. The reader's cached parent pointer now points at a detached node — in a GC language the memory itself is safe, but the logical state is nonsense.
  4. Create races. Two threads both see mkdir("/a/b") as missing and both create it. Without coordination you end up with two separate Directory objects, one of them orphaned.

Lock strategies

| Strategy | Granularity | Pros | Cons | When to use |
|---|---|---|---|---|
| Global FS mutex | One lock for the entire tree | Trivial to reason about. Deadlock-free. | Serializes everything. 10k concurrent readers queue behind one writer. | Prototype, low contention, tests. |
| Per-directory lock | One lock per Directory node | Parallelism across unrelated subtrees. Natural lock boundary — mkdir locks the parent. | Tree-spanning operations (mv) need multiple locks → deadlock potential. Readers still collide with concurrent writes. | Moderate contention, mostly-separate subtrees. |
| Per-file RWLock (readers-writer) | One RWLock per File, plus per-directory locks for structure | Many concurrent readers. Writers serialize per file. Fine-grained and real-FS-like. | Most complex. Lock-ordering discipline needed to avoid deadlock (mv holds two dir locks + one file lock). | Production-grade. My default for this interview. |
| Snapshot / Copy-on-Write | No locks on read — readers see an immutable snapshot; writes replace pointers atomically | Perfect read scalability. No reader-writer interference. Great for backup/snapshots. | High write cost (copy the affected path). Memory balloons during concurrent writes. Memory-reclamation is nontrivial (need epoch-based GC or refcounts). | Read-heavy, snapshot semantics desired (e.g. ZFS, etcd). |

Deadlock and recursive operations

mv /a/x /b/y is the classic hazard:

  • Holds lock on /a (to remove x).
  • Acquires lock on /b (to insert y).
  • Another thread does mv /b/z /a/w simultaneously: holds /b, wants /a. Deadlock.

Mitigation: total lock ordering. Always acquire directory locks in a fixed order (e.g. by path lexicographically, or by node identity / memory address). If you need /a and /b, lock the smaller-named one first. Every caller follows the same rule → no cycles.
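The ordering rule can be sketched deterministically. Lock here is a stand-in that only records acquisition order; the point is the sort, not the mutex mechanics.

```typescript
// Sketch of total lock ordering. `Lock` is a stand-in recorder, not a
// real mutex; the technique is the lexicographic sort before acquiring.
class Lock {
  constructor(public readonly path: string) {}
}

function acquireInOrder(locks: Lock[], acquired: string[]): void {
  // Sort by path so every caller locks in the same global order; two
  // mv's touching /a and /b can never wait on each other in a cycle.
  const ordered = [...locks].sort((x, y) => x.path.localeCompare(y.path));
  for (const lock of ordered) acquired.push(lock.path);
}

const a = new Lock("/a");
const b = new Lock("/b");

// mv /a/x /b/y and mv /b/z /a/w both end up locking /a before /b,
// regardless of the order the caller names them in.
const order1: string[] = [];
acquireInOrder([a, b], order1);
const order2: string[] = [];
acquireInOrder([b, a], order2);
console.log(order1, order2); // ["/a", "/b"] ["/a", "/b"]
```

With real mutexes the loop body would await each lock; the sort line is the entirety of the deadlock mitigation.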

Senior-level take

For an interview I'd recommend: per-file RWLock + per-directory lock + lexicographic lock ordering for multi-node operations. It matches the mental model a senior engineer has from real file systems (per-inode locking is roughly how ext4 approaches it; ZFS sits in the snapshot/COW camp), it explains itself cleanly, and it has a clear upgrade path to the snapshot model if read scaling becomes the bottleneck.

Concrete scenarios under per-file RWLock

  • Two concurrent reads of /a.txt — both threads take the read lock; RWLock allows N concurrent readers; no contention.
  • Read during write on /a.txt — reader blocks until writer finishes; observes the post-write content. No torn reads.
  • addContentToFile("/a.txt", "X") vs addContentToFile("/a.txt", "Y") — writers serialize on the file's write lock. Final content is "XY" or "YX" depending on arrival, but never interleaved at the character level.
  • mkdir("/x/y") vs mkdir("/x/z") — both acquire /x's directory write lock serially; each creates a different child. No conflict.
  • mkdir("/x/y") vs mkdir("/x/y") — first wins and creates; second acquires /x's write lock, sees the child now exists, no-ops.
  • mv /a/x /b/y — acquire locks on /a and /b in lexicographic order of path (/a first). Then the file's write lock. Perform the move. Release in reverse order.
  • Read iteration over ls("/a") while another thread does mkdir("/a/new") — ls takes /a's read lock, snapshots the names, releases. mkdir waits for the read lock to drop. No iterator invalidation.

typescript
// Sketch — not executable. Just to show the shape.
class File implements INode {
  private readonly lock = new ReadWriteLock(); // hypothetical RWLock type
  private content = ""; // as in the core File class

  read(): string {
    return this.lock.withRead(() => this.content);
  }

  append(text: string): void {
    this.lock.withWrite(() => {
      this.content += text;
    });
  }
}

Scale & Extensibility

Large files

  • Store content as a list of chunks (string[] of 4KB blocks). Append is O(1); reads concat on demand or stream.
  • File.size becomes sum of chunk sizes — maintain incrementally on append.
  • Opens the door to lazy loading from a backing store: only page in chunks you're reading.
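The chunked representation described above can be sketched directly. ChunkedFile is a hypothetical name; the 4KB block size follows the bullet above, and size is maintained incrementally on append rather than recomputed.

```typescript
// Hypothetical ChunkedFile: content as a list of fixed-size blocks, as
// described above. Append cost is bounded by the block size plus the
// appended text, not the total file size; size updates incrementally.
const BLOCK_SIZE = 4096;

class ChunkedFile {
  private chunks: string[] = [];
  private bytes = 0;

  get size(): number {
    return this.bytes; // maintained on append, never recomputed
  }

  append(text: string): void {
    this.bytes += text.length;
    let rest = text;
    while (rest.length > 0) {
      const last = this.chunks[this.chunks.length - 1];
      const room = last === undefined ? 0 : BLOCK_SIZE - last.length;
      if (room === 0) {
        // Start a new block.
        this.chunks.push(rest.slice(0, BLOCK_SIZE));
        rest = rest.slice(BLOCK_SIZE);
      } else {
        // Top up the last partially-filled block.
        this.chunks[this.chunks.length - 1] = last + rest.slice(0, room);
        rest = rest.slice(room);
      }
    }
  }

  read(): string {
    return this.chunks.join(""); // reassemble on demand
  }
}

const f = new ChunkedFile();
f.append("hello");
f.append(" world");
console.log(f.read(), f.size); // "hello world" 11
```

A lazy-loading variant would replace the string[] with block IDs resolved against a backing store inside read().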

Persistence

  • Add a Journal that logs every mutation (mkdir, addContentToFile, delete, rename) as an append-only record.
  • On restart, replay the journal into an empty tree.
  • Compact the journal periodically by taking a snapshot (tree walk → serialized form) + journal-from-snapshot.
  • This is exactly how SQLite, LevelDB, and most KV stores handle durability.
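The journal-and-replay idea can be sketched as follows. The JournalRecord shape and the FlatFs stand-in are assumptions for illustration; in the real design, replay would target the FileSystem facade from the core API.

```typescript
// Sketch of an append-only journal. The record shape is an assumption;
// replay applies each mutation, in order, against a fresh file system.
type JournalRecord =
  | { op: "mkdir"; path: string }
  | { op: "addContentToFile"; path: string; content: string };

interface IFileSystem {
  mkdir(path: string): void;
  addContentToFile(path: string, content: string): void;
}

class Journal {
  private records: JournalRecord[] = [];

  log(record: JournalRecord): void {
    this.records.push(record); // production: fsync to an append-only file
  }

  replay(fs: IFileSystem): void {
    for (const r of this.records) {
      if (r.op === "mkdir") fs.mkdir(r.path);
      else fs.addContentToFile(r.path, r.content);
    }
  }
}

// Tiny stand-in FS to show that replay reproduces state after "restart".
class FlatFs implements IFileSystem {
  readonly files = new Map<string, string>();
  mkdir(_path: string): void {}
  addContentToFile(path: string, content: string): void {
    this.files.set(path, (this.files.get(path) ?? "") + content);
  }
}

const journal = new Journal();
journal.log({ op: "mkdir", path: "/docs" });
journal.log({ op: "addContentToFile", path: "/docs/a.txt", content: "hello" });
journal.log({ op: "addContentToFile", path: "/docs/a.txt", content: " world" });

const rebuilt = new FlatFs();
journal.replay(rebuilt);
console.log(rebuilt.files.get("/docs/a.txt")); // "hello world"
```

Compaction replaces the record list with a snapshot plus only the records logged after it; replay semantics are unchanged.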

Permissions (POSIX-style)

  • Add owner, group, mode to INode. PermissionChecker class consults these before every op.
  • mkdir inherits from parent (umask-style).
  • Implementing this without touching callers is a clean win for the Composite abstraction: the check goes in FileSystem.traverse, invisible to users.

Symlinks

  • New subtype: class Symlink implements INode with a target: Path.
  • traverse dereferences symlinks (with a max-hops counter to prevent loops: classic ELOOP on Unix is 40).
  • Lots of edge cases: symlink to a symlink, symlink to a directory we're listing, ls -L vs ls semantics (follow or show the link).
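The dereference loop with a hop counter can be sketched over a simplified flat path-to-node map; SNode, resolve, and the flat map are illustrative stand-ins for the real tree traversal.

```typescript
// Sketch of symlink dereference with a hop counter (the ELOOP guard
// mentioned above). Nodes here are a simplified flat map from absolute
// path to node, standing in for the real tree walk.
type SNode =
  | { kind: "file"; content: string }
  | { kind: "symlink"; target: string };

const MAX_HOPS = 40; // the classic Unix ELOOP limit noted above

function resolve(nodes: Map<string, SNode>, path: string): SNode {
  let hops = 0;
  let current = nodes.get(path);
  while (current !== undefined && current.kind === "symlink") {
    if (++hops > MAX_HOPS) {
      throw new Error(`ELOOP: too many symlinks resolving ${path}`);
    }
    current = nodes.get(current.target);
  }
  if (current === undefined) throw new Error(`path not found: ${path}`);
  return current;
}

const nodes = new Map<string, SNode>([
  ["/a.txt", { kind: "file", content: "hi" }],
  ["/link1", { kind: "symlink", target: "/a.txt" }],
  ["/link2", { kind: "symlink", target: "/link1" }], // symlink to a symlink
  ["/loop", { kind: "symlink", target: "/loop" }],   // self-cycle
]);

console.log(resolve(nodes, "/link2")); // the file node holding "hi"
// resolve(nodes, "/loop") throws ELOOP once the counter passes 40.
```

In the real design this loop lives inside FileSystem.traverse, applied per component, so a symlink in the middle of a path also dereferences.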

Quotas per user

  • QuotaManager tracks bytes used per owner. addContentToFile pre-checks quota, rejects with EDQUOT on overflow.
  • Requires the permissions model (you need "owner").

Journaling & crash recovery

  • Write-ahead log (WAL) every mutation before applying it.
  • On crash, replay WAL from last checkpoint.
  • For in-memory FS this is mostly relevant when you add persistence.

Snapshots / versioning

  • Make nodes persistent (COW): every mutation creates new nodes up the path to root; unchanged subtrees share.
  • Keep a map of snapshot-id → root node. Constant-time snapshots, O(changes) memory per snapshot.
  • This is how Git stores trees. Beautiful fit for this data model.
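The COW mechanics can be sketched with immutable nodes. setPath, PNode, and dirOf are illustrative names; the point is that a mutation rebuilds only the path from the change to the root while untouched subtrees are shared.

```typescript
// Sketch of copy-on-write (persistent) tree nodes, as described above.
// setPath never mutates; it returns a new root sharing unchanged subtrees.
type PNode =
  | { kind: "file"; content: string }
  | { kind: "dir"; children: ReadonlyMap<string, PNode> };

const emptyDir: PNode = { kind: "dir", children: new Map() };

function setPath(root: PNode, components: string[], leaf: PNode): PNode {
  if (components.length === 0) return leaf;
  if (root.kind !== "dir") throw new Error("not a directory");
  const [head, ...rest] = components;
  const child = root.children.get(head) ?? emptyDir;
  // Copy only this level's child map; deeper unchanged nodes are shared.
  const children = new Map(root.children);
  children.set(head, setPath(child, rest, leaf));
  return { kind: "dir", children };
}

function dirOf(node: PNode | undefined): ReadonlyMap<string, PNode> {
  if (node === undefined || node.kind !== "dir") throw new Error("not a dir");
  return node.children;
}

// A snapshot is just a retained root pointer. "Mutating" v1 yields v2.
const v1 = setPath(emptyDir, ["docs", "a.txt"], { kind: "file", content: "hello" });
const v2 = setPath(v1, ["docs", "b.txt"], { kind: "file", content: "world" });

const docsV1 = dirOf(dirOf(v1).get("docs"));
const docsV2 = dirOf(dirOf(v2).get("docs"));
console.log(docsV1.size, docsV2.size); // 1 2
// Structural sharing: the unchanged a.txt node is the same object in both.
console.log(docsV1.get("a.txt") === docsV2.get("a.txt")); // true
```

The identity check on the last line is the whole memory argument: each snapshot costs O(changed path), not O(tree).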

Beyond memory

  • Shard the tree: top-level directories routed to different nodes of a distributed service.
  • Consistency becomes the hard part — CAP tradeoffs, Paxos/Raft for metadata, etc.
  • This is the leap from "in-memory FS" to "distributed FS" (GFS, HDFS). Flag as out of scope unless the interviewer pushes.

Edge Cases

  1. Empty path "" — Reject in Path.parse. There's no "empty path" in a file system.
  2. Trailing slash /a/b/ — split + filter naturally handles this; "/a/b/".split("/").filter(Boolean) yields ["a", "b"].
  3. Double slashes /a//b — Same filter handles it. Some systems treat // specially (Cygwin UNC paths); we don't.
  4. Root path "/" — Path.parse("/") yields components [] and isRoot() === true. ls("/") lists root's children; mkdir("/") is a no-op; addContentToFile("/", ...) errors (can't write to root).
  5. Nonexistent path on ls — Throw. LC 588 guarantees existence, but production API must fail loud.
  6. Nonexistent path on mkdir — That's the entire point; create the chain.
  7. addContentToFile under a file — /a/b exists as a file; someone calls addContentToFile("/a/b/c", ...). traverse hits the file at b and throws. We must fail before creating anything.
  8. mkdir where a file exists — /a is a file; mkdir("/a") must error, not overwrite.
  9. ls on a file — Per LeetCode 588: returns a single-element array with the file's name, not its content, not an error. This is a spec quirk worth calling out aloud in the interview — it trips up most candidates.
  10. Re-adding content — addContentToFile("/a", "hello") then addContentToFile("/a", " world") → content is "hello world". Append semantics, not overwrite.
  11. Deeply nested mkdir — /a/b/c/.../z 10k deep. Iterative traversal survives; recursive would stack-overflow.
  12. Concurrent mkdir of the same path — In single-threaded core, second call is a no-op. In a threaded extension, this is the create-race in Concurrency Considerations.
  13. Reading from a directory — Error. Don't try to serialize listNames() into a fake "content."
  14. Case-sensitivity — /A/b and /a/b are distinct. The Map<string, INode> respects this natively because JS string equality is case-sensitive.
  15. Unicode names — Work out of the box since JS strings are UTF-16. Noteworthy if the interviewer digs: normalization (NFC vs NFD) is a real bug source on macOS HFS+.
  16. Zero-length append — addContentToFile("/a", "") still creates the file if missing. Consistent with the spec.
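The parsing-related cases above (empty path, trailing slash, double slashes, root) can all be exercised with a small sketch. parsePath and ParsedPath below are illustrative stand-ins for the article's Path.parse, not its actual signature:

```typescript
// Illustrative stand-in for Path.parse; names and error messages are assumptions.
class ParsedPath {
  constructor(readonly components: string[]) {}
  isRoot(): boolean { return this.components.length === 0; }
}

function parsePath(raw: string): ParsedPath {
  if (raw === "") throw new Error("empty path");                       // edge case 1
  if (!raw.startsWith("/")) throw new Error("relative paths not supported");
  // split + filter normalizes trailing slashes and double slashes (cases 2 and 3)
  const components = raw.split("/").filter(Boolean);
  if (components.some(c => c === "." || c === "..")) {
    throw new Error("'.' and '..' not supported");
  }
  return new ParsedPath(components);
}
```

Both parsePath("/a/b/") and parsePath("/a//b") normalize to components ["a", "b"], and "/" is the only input for which isRoot() is true, matching cases 2–4.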

Follow-up Questions

  1. How would you add symlinks? New INode subtype with a target: Path. Modify traverse to dereference, with a loop counter to catch cycles. Decide ls semantics: follow or show.
  2. How would you efficiently handle very large files? Chunked storage (list of fixed-size blocks), lazy loading from a backing store, streaming reads.
  3. How would you persist this to disk? Append-only journal for durability + periodic snapshots for compaction. On startup, replay journal from last snapshot.
  4. How would you add POSIX permissions? owner, group, mode on INode; PermissionChecker invoked by FileSystem.traverse. Discuss setuid / sticky bit as extensions.
  5. How would you implement rm -rf /a transactionally? Snapshot / COW: replace root's pointer to /a's parent with a new version that omits a. Old tree becomes unreachable (or reachable via a snapshot id). Either it happens fully or not at all.
  6. How would you snapshot the FS cheaply? Persistent / COW nodes. Taking a snapshot is just capturing the current root pointer. Mutations build new path-to-root. Structural sharing keeps memory overhead to O(changes per snapshot).
  7. How would you implement cp and mv? cp -r is a tree walk + clone each node. mv in the same parent directory is a rename (remove+insert under new name). Cross-directory mv is atomic remove from src + insert at dst — protect with two directory locks in lexicographic order.
  8. How does this compare to a Unix FS on inodes? Unix decouples file content (inode) from file names (directory entries). Multiple names can point to one inode → hard links. Our design fuses name and node, so hard links are impossible without refactoring to an inode-style indirection.
  9. How would this scale past one machine's memory? Shard by top-level directory. Add a metadata service (Paxos/Raft) for the namespace. Content goes to an object store. That's essentially GFS / HDFS.
  10. How would you add ls -R (recursive)? Visitor pattern. Depth-first traversal that yields (path, node) tuples. Use a generator for O(1) extra memory beyond the stack.
  11. How would you add find with predicates? Visitor again, with a predicate filter. Build pipelines: walk → filter by name → filter by size → collect.
  12. Race-free mkdir -p? Either global FS lock, or optimistic: walk, catch AlreadyExists as "another thread got there first, use their node," retry from the point of conflict.
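The visitor-style traversal behind questions 10 and 11 can be sketched with a generator. The TreeNode shape, walk, and find below are simplified assumptions for illustration, not the article's INode hierarchy:

```typescript
// Hypothetical simplified node shape for illustrating the visitor walk.
type TreeNode =
  | { kind: "file"; content: string }
  | { kind: "dir"; children: Map<string, TreeNode> };

// Depth-first walk yielding (path, node) pairs: the basis for ls -R.
// The generator keeps extra memory at O(1) beyond the traversal stack.
function* walk(node: TreeNode, path = ""): Generator<[string, TreeNode]> {
  yield [path || "/", node];
  if (node.kind === "dir") {
    for (const [name, child] of node.children) {
      yield* walk(child, `${path}/${name}`);
    }
  }
}

// find with a predicate is just a filter stage over the walk pipeline.
function* find(root: TreeNode, pred: (path: string, n: TreeNode) => boolean) {
  for (const [p, n] of walk(root)) {
    if (pred(p, n)) yield p;
  }
}
```

With a tree containing /a/x.txt and /b.txt, find(root, (_, n) => n.kind === "file") yields exactly those two paths in depth-first order, and the same pipeline extends to size or name filters by composing predicates.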

SDE2 vs SDE3 — How the Bar Rises

Dimension by dimension, SDE2 (mid) vs SDE3 (senior):

Composite understanding
  • SDE2: Implements File and Directory as separate classes; sees they share behavior but may not name the pattern.
  • SDE3: Names Composite upfront. Justifies INode as the abstraction. Anticipates Symlink/DeviceNode as future subtypes fitting the same hierarchy.

Path parsing
  • SDE2: Handles /, basic parsing via split. Misses trailing slash, double slash, or treats them as errors instead of normalizing.
  • SDE3: Handles //, trailing /, rejects empty / relative / ./.. with clean error messages. Extracts parsing into Path.parse as a single validation point.

Traversal
  • SDE2: Recursive, clean for shallow trees. Doesn't consider stack-depth pathology.
  • SDE3: Iterative walk; calls out the stack-depth concern explicitly. Unifies the walk into a single traverse(path, createMissingDirs) helper.

Error modeling
  • SDE2: Throws generic Error. Sometimes silently creates on ambiguous inputs.
  • SDE3: Distinguishes PathNotFound, NotADirectory, IsADirectory, PermissionDenied. Discusses matching POSIX errno semantics.

ls on a file
  • SDE2: Often missed, or returns content.
  • SDE3: Calls out the LC 588 quirk proactively: "ls on a file returns a single-element list with the file's name."

Concurrency
  • SDE2: "We could add a mutex."
  • SDE3: Names three lock strategies (global, per-directory, per-file RWLock) with tradeoffs. Discusses lock ordering to prevent deadlock. Mentions COW/snapshot as a fourth approach.

Extensibility seams
  • SDE2: Says "we could add symlinks later."
  • SDE3: Points to INode as the extension point, walks through how traverse would dereference symlinks, mentions cycle detection via hop counter.

Large files
  • SDE2: Stores as a single string. Aware it's suboptimal.
  • SDE3: Proposes chunked storage from the start if files might be large. Ties chunked storage to persistence (each chunk flushes independently).

Persistence path
  • SDE2: "We could write to disk."
  • SDE3: Describes journal + periodic snapshot approach (WAL). Ties it to crash recovery and the typical DB-style playbook.

Recursion vs iteration awareness
  • SDE2: Doesn't think about it.
  • SDE3: Pre-empts the stack-depth concern; iterative by default; mentions trampolining as an option.

Scope discipline
  • SDE2: Tries to implement symlinks and permissions inline when asked.
  • SDE3: Calls scope explicitly: "that's a one-afternoon extension, I'll keep it out of the core and sketch the seam."

Testing mindset
  • SDE2: Asks for a test harness.
  • SDE3: Walks through edge cases unprompted: empty path, trailing slash, file-under-file, ls on a file, concurrent mkdir, deeply nested paths.

The SDE2 gets a working file system. The SDE3 gets a working file system and a convincing story about what happens next — symlinks, quotas, persistence, concurrency — because the abstractions were drawn with that future in mind from the first line of code.
