15 — Problem: In-Memory File System
Understanding the Problem
An in-memory file system models the tree of files and directories you'd find on disk, but the whole thing lives in process memory. No inodes, no block devices, no page cache — just a tree of nodes rooted at /, with operations to list a directory, create a directory, append content to a file, and read a file. LeetCode 588 and 1166 are the canonical versions. In an interview setting this problem is less about the algorithm and more about whether you can name the right pattern, draw a clean abstraction boundary, and hold the conversation when the interviewer pushes toward symlinks, permissions, or concurrent writers.
What is an In-Memory File System?
A tree where every node is either a file (a leaf holding content) or a directory (an internal node holding named children). Clients interact with it through string paths — /a/b/c — which the file system parses into a list of components and walks. This is the textbook Composite pattern: File and Directory share an abstract Node interface so traversal, listing, and future operations (delete, move, stat) can treat them uniformly. Interviewers use this problem to probe tree modeling, path parsing edge cases, abstraction boundaries (where does the parser end and the tree begin?), and extensibility seams for symlinks, permissions, and persistence. Get the Node hierarchy right and the rest of the design falls out.
Clarifying Questions
You: What's the path separator, and is there a root?
Interviewer: Unix-style.
/is the separator;/is the root. All paths start with/.
You: Absolute paths only, or do we support relative paths and a current working directory?
Interviewer: Absolute only. No
cwd, no., no...
You: Case-sensitive?
Interviewer: Yes.
/Aand/aare different.
You: Are there size limits on files? Max depth of the tree?
Interviewer: No hard limits for the core design. Assume files fit in memory. Mention how you'd bound things in production.
You: Permissions — owners, modes, ACLs?
Interviewer: Out of scope for the core API. Discuss as a follow-up.
You: Symlinks? Hard links?
Interviewer: Out of scope. Definitely discuss as a follow-up — it's a good extensibility question.
You: Move, copy, delete?
Interviewer: Not in the core API. Implement
ls,mkdir,addContentToFile,readContentFromFile. Talk aboutrmandmvas follow-ups.
You: What are the exact semantics of
lson a file vs a directory?
Interviewer: On a directory, return a sorted list of children names. On a file, return a single-element list containing just the file name. This matches LeetCode 588.
You: What does
addContentToFiledo if the file already exists? If the parent directory doesn't exist?
Interviewer: Append if the file exists. Create the file (and any missing parent directories) if it doesn't. Think
mkdir -psemantics for the parents.
You: Concurrency — single caller or multi-threaded?
Interviewer: Single-threaded for the core. Discuss concurrency as a follow-up.
You: Atomic writes — if
addContentToFileis mid-write and another reader comes in, what do they see?
Interviewer: Single-threaded, so not an issue for the core. But yes, discuss when we get to concurrency.
Functional Requirements
ls(path: string): string[]— List the contents of a path.- If
pathis a directory, return children names sorted lexicographically. - If
pathis a file, return a single-element array with just the file's name (not its content). - If
pathdoesn't exist, throw / error. (LC 588 guarantees the path exists, but a production API should fail loudly.)
- If
mkdir(path: string): void— Create a directory, and any missing parent directories (mkdir -p).- Idempotent: creating a directory that already exists is a no-op.
- Error if any component of the path is an existing file.
addContentToFile(path: string, content: string): void— Write to a file.- Create the file if it doesn't exist, creating parent directories as needed.
- Append
contentto the file if it already exists. - Error if
pathresolves to a directory.
readContentFromFile(path: string): string— Return the full content of a file.- Error if
pathdoesn't exist or resolves to a directory.
- Error if
Out of scope: rm, mv, cp, permissions, symlinks, persistence, concurrent access (all discussed as follow-ups).
Non-Functional Requirements
| Dimension | Target | Reasoning |
|---|---|---|
| Read latency | O(D) for traversal + O(K log K) for ls sort, where D = depth, K = children. Content reads are O(N) in file size. | Tree walk is unavoidable; sort on ls is a spec requirement. |
| Write latency | O(D) for path traversal + O(C) append for content, where C = appended length. | mkdir -p traversal cost is dominated by tree depth, not breadth. |
| Memory footprint | O(total path chars + total content bytes). Per-node overhead dominated by children Map. | A large flat directory with 10k files holds 10k Map entries in its parent. |
| Concurrency | Core: single-threaded. Extension: per-file RWLock + lock-ordering for multi-file operations. | Covered in Concurrency Considerations below. |
| Extensibility | Must support symlinks, ACLs, quotas, and a disk-backed store with localized changes — ideally at the Node level. | The Composite hierarchy is the extension surface. |
| Error behavior | Fail loudly on bad paths, type mismatches (file-under-file), and read-from-directory. | Silent failure in a file system is catastrophic. |
Core Entities and Relationships
| Entity | Role | Key State |
|---|---|---|
INode (abstract) | Base type for anything in the tree. Enables Composite uniformity. | name, type ("file" | "dir"), size |
File (concrete INode) | Leaf node. Holds content. | content: string |
Directory (concrete INode) | Internal node. Holds named children. | children: Map<string, INode> |
Path | Parsed path — a sequence of components, plus absolute/root metadata. | components: string[] |
FileSystem | Facade. Owns the root directory, parses paths, routes operations. | root: Directory |
Why Composite? ls doesn't care whether the thing at a path is a file or a directory — it just needs name, and polymorphic listContents(). mkdir -p walks the tree descending through Directory nodes and stopping if it hits a File. Treating File and Directory as sibling subtypes of a common INode lets every traversal be a single polymorphic walk. Without Composite you'd smear if (isFile) ... else ... checks across every operation — exactly the duplication the pattern exists to prevent.
Why keep Path separate? Path parsing is pure string manipulation. Tree traversal is pointer chasing. If you mix them, every operation has to re-parse on every call and the traversal code fills with string-slicing noise. Separating them means FileSystem.ls() reads as "parse, walk, list" — three lines instead of fifteen.
Interfaces
type NodeType = "file" | "dir";
interface INode {
readonly name: string;
readonly type: NodeType;
readonly size: number; // bytes for files; number of children for dirs
}
interface IPath {
readonly components: string[]; // [] means root "/"
readonly raw: string;
isRoot(): boolean;
parent(): IPath;
last(): string;
join(component: string): IPath;
}
interface IFileSystem {
ls(path: string): string[];
mkdir(path: string): void;
addContentToFile(path: string, content: string): void;
readContentFromFile(path: string): string;
}The INode interface is intentionally narrow — just enough for polymorphic reads. Concrete classes expose mutation operations (addContent, addChild) but clients only ever touch the IFileSystem facade, never INode directly. That boundary is what lets us add symlinks later without breaking callers.
Class Diagram
┌──────────────────────┐
│ <<interface>> │
│ INode │
│ + name: string │
│ + type: NodeType │
│ + size: number │
└──────────┬───────────┘
│
┌───────────────┴───────────────┐
│ │
┌────────▼─────────┐ ┌─────────▼──────────┐
│ File │ │ Directory │
│──────────────────│ │────────────────────│
│ - content: str │ │ - children: │
│ + append(s) │ │ Map<str, INode> │
│ + read(): str │ │ + addChild(n) │
└──────────────────┘ │ + getChild(n) │
│ + listNames() │
└────────┬───────────┘
│ 0..*
│ (holds INode)
▼
back-edge to INode
┌──────────────────────────────────┐
│ FileSystem │
│──────────────────────────────────│
│ - root: Directory │
│──────────────────────────────────│
│ + ls(path): string[] │
│ + mkdir(path): void │
│ + addContentToFile(path, c) │
│ + readContentFromFile(path): str │
│ - parse(path): Path │
│ - traverse(path, createDirs): N │
└──────────────────────────────────┘
│
│ owns
▼
Directory (root)
│
│ children (Composite)
▼
File │ Directory │ ...The core insight: the recursion structure lives in Directory.children holding INode — a directory's children are nodes, and a node might itself be a directory. That's the whole Composite pattern.
Class Design
File
State:
name: string— just the basename, e.g."report.txt".content: string— the raw content. Starts empty.
Methods:
append(text: string): void— append to content. O(C) in appended length.read(): string— return content. O(1) reference return (string is reference-shared in JS).size: number(getter) —content.length.
The File class doesn't know its own path. Path context is the FileSystem's job. If we stored parent pointers we'd need to update them on every move, and we'd break the simple uniform tree invariant.
Directory
State:
name: stringchildren: Map<string, INode>— name → child node. Map because we need O(1) lookup during traversal; children names must be unique.
Methods:
addChild(node: INode): void— throws if a child with that name already exists with a conflicting type.getChild(name: string): INode | undefined— O(1) lookup.hasChild(name: string): booleanlistNames(): string[]— sorted children names, forls.size: number(getter) —children.size.
Why Map instead of an Array? Traversal does a lookup on every path component. With 10k children in a directory, Array.find is O(N) per step — a deep path in a wide tree turns into a quadratic disaster. Map gives O(1). We pay ordering in exchange, but we only need ordering at ls time, which is a one-shot sort.
Path
State:
components: string[]—["a", "b", "c"]for/a/b/c. Empty for/.raw: string— original input, useful for error messages.
Methods:
isRoot(): boolean—components.length === 0.parent(): Path— drop the last component.last(): string— the final component (basename).join(component: string): Path— append a component.
Parsing is a static factory: Path.parse("/a/b/c"). It handles trailing slashes, collapses double slashes, and rejects empty / non-absolute input.
FileSystem
State:
root: Directory— singleton root named"".
Methods:
ls,mkdir,addContentToFile,readContentFromFile— the public API.traverse(path: Path, createMissingDirs: boolean): INode— private helper. This is where the Composite tree walk lives.createMissingDirstogglesmkdir -psemantics.
The traverse helper is the single point of path-walking logic. Every public method ends up calling it, differing only in the createMissingDirs flag and what they do with the result. That's how you avoid bugs: one walk, many callers.
Key Methods
// ---- Path parsing ----
class Path implements IPath {
constructor(
public readonly components: string[],
public readonly raw: string,
) {}
static parse(raw: string): Path {
if (typeof raw !== "string" || raw.length === 0) {
throw new Error(`invalid path: ${JSON.stringify(raw)}`);
}
if (raw[0] !== "/") {
throw new Error(`path must be absolute: ${raw}`);
}
// Split and filter empty segments — this collapses "//" and trailing "/".
const components = raw.split("/").filter((c) => c.length > 0);
for (const c of components) {
if (c === "." || c === "..") {
throw new Error(`path component ${c} not supported`);
}
}
return new Path(components, raw);
}
isRoot(): boolean {
return this.components.length === 0;
}
parent(): Path {
if (this.isRoot()) throw new Error("root has no parent");
return new Path(this.components.slice(0, -1), this.raw);
}
last(): string {
if (this.isRoot()) throw new Error("root has no basename");
return this.components[this.components.length - 1];
}
join(component: string): Path {
return new Path([...this.components, component], `${this.raw}/${component}`);
}
}
// ---- Nodes ----
class File implements INode {
readonly type = "file" as const;
private content: string = "";
constructor(public readonly name: string) {}
get size(): number {
return this.content.length;
}
append(text: string): void {
this.content += text;
}
read(): string {
return this.content;
}
}
class Directory implements INode {
readonly type = "dir" as const;
private readonly children: Map<string, INode> = new Map();
constructor(public readonly name: string) {}
get size(): number {
return this.children.size;
}
addChild(node: INode): void {
const existing = this.children.get(node.name);
if (existing && existing.type !== node.type) {
throw new Error(
`cannot add ${node.type} "${node.name}": a ${existing.type} with that name exists`,
);
}
if (!existing) this.children.set(node.name, node);
}
getChild(name: string): INode | undefined {
return this.children.get(name);
}
hasChild(name: string): boolean {
return this.children.has(name);
}
listNames(): string[] {
return [...this.children.keys()].sort();
}
}
// ---- FileSystem facade ----
class FileSystem implements IFileSystem {
private readonly root: Directory = new Directory("");
ls(path: string): string[] {
const p = Path.parse(path);
const node = this.traverse(p, /* createMissingDirs */ false);
if (node.type === "file") {
// LC 588 quirk: ls on a file returns just the file's name.
return [node.name];
}
return (node as Directory).listNames();
}
mkdir(path: string): void {
const p = Path.parse(path);
if (p.isRoot()) return; // root always exists
this.traverse(p, /* createMissingDirs */ true);
}
addContentToFile(path: string, content: string): void {
const p = Path.parse(path);
if (p.isRoot()) throw new Error("cannot write to root");
// Walk to the parent directory, creating any missing dirs along the way.
const parent = this.traverse(p.parent(), /* createMissingDirs */ true);
if (parent.type !== "dir") {
throw new Error(`parent of ${path} is not a directory`);
}
const parentDir = parent as Directory;
const basename = p.last();
const existing = parentDir.getChild(basename);
if (existing === undefined) {
const file = new File(basename);
file.append(content);
parentDir.addChild(file);
return;
}
if (existing.type !== "file") {
throw new Error(`${path} is a directory, cannot add content`);
}
(existing as File).append(content);
}
readContentFromFile(path: string): string {
const p = Path.parse(path);
if (p.isRoot()) throw new Error("root is not a file");
const node = this.traverse(p, /* createMissingDirs */ false);
if (node.type !== "file") {
throw new Error(`${path} is a directory`);
}
return (node as File).read();
}
// ---- Core traversal ----
//
// This is where the Composite tree walk happens. Every public method
// routes through here. The `createMissingDirs` flag toggles mkdir -p.
//
// Iterative, not recursive — for deep trees we don't want to blow
// the call stack. See "Tree traversal" under Design Decisions.
private traverse(path: Path, createMissingDirs: boolean): INode {
let current: INode = this.root;
for (let i = 0; i < path.components.length; i++) {
const component = path.components[i];
if (current.type !== "dir") {
// Can't descend through a file.
throw new Error(
`cannot traverse into file at segment "${component}" of ${path.raw}`,
);
}
const dir = current as Directory;
let next = dir.getChild(component);
if (next === undefined) {
if (!createMissingDirs) {
throw new Error(`path not found: ${path.raw}`);
}
next = new Directory(component);
dir.addChild(next);
}
current = next;
}
return current;
}
}Notice how the four public methods are each 4–10 lines. The real logic lives in Path.parse and traverse. That's not an accident — it's what the Composite + facade split buys you. If you later add rm, you'll write it in a couple of lines too, because the walk is already there.
Walkthrough — how a single call flows
Trace addContentToFile("/docs/april/report.txt", "hello") on an empty file system, step by step:
Path.parse("/docs/april/report.txt")→Path { components: ["docs", "april", "report.txt"], raw: ... }.p.isRoot()is false, so we proceed.p.parent()→Path { components: ["docs", "april"], ... }.traverse(parent, createMissingDirs=true):current = root(Directory"").- Step
"docs":root.getChild("docs")→ undefined. BecausecreateMissingDirs, createnew Directory("docs"),root.addChild(it),current = it. - Step
"april": same — create and descend. - Returns the
Directory("april")we just created.
parent.type === "dir"check passes.parentDir.getChild("report.txt")→ undefined.- Create
new File("report.txt"),file.append("hello"),parentDir.addChild(file).
A second call addContentToFile("/docs/april/report.txt", " world"):
- Parse and walk as before; this time every directory exists, no creation.
parentDir.getChild("report.txt")→ the existingFile.existing.type === "file"→ callexisting.append(" world").
Final content: "hello world". No full-path lookup table needed — the tree structure is the index.
Design Decisions & Tradeoffs
File content storage
| Representation | Pros | Cons | When to use |
|---|---|---|---|
string (mutable via reassignment) | Simple, idiomatic, reference-shared in JS. content += s is a one-liner. | Repeated append creates a new string each time; O(N) per append for a long file. V8 sometimes uses ropes internally, but you can't rely on it. | Demo, short files, interview solution. |
Buffer / Uint8Array | Binary-safe, fixed-width, explicit capacity management. | Appending requires reallocating (or a second-level structure). Treats content as bytes, not text. | Production, binary files, large files. |
Chunked / block list (string[] or linked list of blocks) | Append is O(1) amortized. Random access is O(chunks) not O(bytes). Plays nicely with persistence (each block flushes independently). | More code. Reads reassemble. Seek/slice isn't free. | Large files, future persistence backing. |
For the interview: state the string choice, then show you know the Buffer + chunked story if the file gets large.
Children collection
| Structure | Lookup | Insert | Ordered iteration | Memory |
|---|---|---|---|---|
Map<string, INode> | O(1) | O(1) | Insertion order (sort on demand) | One entry per child |
Array<INode> | O(N) | O(1) append, O(N) sorted-insert | Native if sorted on insert | Slightly lower overhead |
TreeMap / sorted structure | O(log N) | O(log N) | O(N) in sorted order | Higher constants |
We chose Map because the 99th-percentile operation is lookup during traversal, and lookups dominate in any non-toy workload. ls is rare relative to traversal, and a one-time sort at ls time is cheap (K log K where K is children in one directory).
Path representation
| Approach | Pros | Cons |
|---|---|---|
Parsed list (string[]) | Traversal is a direct loop. Clean error messages per component. | Allocation cost for every call. |
| Raw string with running index | Zero allocation. | Parser state leaks into traversal code. Slash-counting bugs are common. |
Parsed list wins because path strings are small and the parser is a single point of validation. The allocation is noise next to the tree walk.
Tree traversal: recursive vs iterative
Recursive code for tree walks reads beautifully and matches the data structure. But a recursive mkdir -p /a/b/c/d/e/f/g/... 10,000 levels deep will blow the JavaScript call stack (default limit is ~10k frames). Real file systems allow paths up to PATH_MAX (4096 on Linux) which is fine for recursion, but if someone tests you with a pathological path you want to survive.
Iterative wins the interview because it demonstrates you thought about stack depth. The code is barely longer — one for loop — and it's exception-safe (no half-built trees from a stack overflow mid-creation).
Abstraction boundary: why FileSystem owns path parsing
A tempting alternative is to give Directory a method resolve(path: Path): INode that recurses through children. It "feels object-oriented." It's wrong for this design:
- Directory shouldn't know about paths. A
Directoryis a collection of named children. Paths are a client-side concern — a way of expressing "navigate through a sequence of names." Pushing path knowledge intoDirectorycouples the tree implementation to the parser. - Traversal policy varies by caller.
lswants "fail if missing";mkdirwants "create if missing." IfDirectory.resolvehas to know both modes, it grows flags and callers lose clarity. Putting the walk inFileSystemkeeps the policy where it belongs — next to the public API that chose it. - Future: permission checks, quota checks. These are cross-cutting and naturally live at the
FileSystemlayer, once per traversal step. Shoving them intoDirectorywould sprinkle concerns across every node.
The rule of thumb: Directory knows about children, FileSystem knows about paths. That's the abstraction boundary, and every piece of logic lands on one side or the other.
Patterns Used
| Pattern | Where it shows up | Why |
|---|---|---|
| Composite | INode abstract, File and Directory concrete. Directory.children holds INode. | The whole reason polymorphic tree walks work. Adding Symlink later is a new concrete subtype, not a refactor. |
| Facade | FileSystem class wraps the tree and exposes the four-method API. | Clients don't touch nodes directly. The tree is free to change (B-tree per directory? persistent nodes? COW?) without breaking callers. |
| Iterator | Directory.listNames(), and any recursive walk (e.g. find, du). | Tree enumeration is the natural consumer of an iterator — generators in TS make this easy. |
| Visitor | Not implemented above, but hinted at. Operations like du -s, find -name, checksum are cleanest as visitors that walk the tree and accumulate. Keeps new operations from piling methods onto INode. | Extension point for cross-cutting operations (size totals, integrity checks, permission recalc). |
| Command | Not implemented above. Use for transactional multi-op work: mv is Copy + Delete, and if either fails the other must roll back. | Atomicity for composite operations. |
| Template Method (subtle) | traverse provides the walk skeleton; createMissingDirs flag is the variation point. | Avoids three near-identical traversal implementations. |
Concurrency Considerations
The core design is single-threaded. A senior interviewer will push. Here's the menu.
The core risks
- Lost writes. Two threads call
addContentToFile("/a", "X")andaddContentToFile("/a", "Y"). Depending on interleaving you can end with"XY","YX","X", or"Y". - Torn reads. A reader sees a file mid-append.
- Dangling traversal. A reader walks the tree; a writer deletes a directory underneath them. The reader's cached parent pointer now points to a freed (garbage-collected) node — in a GC language the memory is safe, but the logical state is nonsense.
- Create races. Two threads both see
mkdir("/a/b")as missing and both create it. Without coordination you end up with two separateDirectoryobjects, one of them orphaned.
Lock strategies
| Strategy | Granularity | Pros | Cons | When to use |
|---|---|---|---|---|
| Global FS mutex | One lock for the entire tree | Trivial to reason about. Deadlock-free. | Serializes everything. 10k concurrent readers queue behind one writer. | Prototype, low contention, tests. |
| Per-directory lock | One lock per Directory node | Parallelism across unrelated subtrees. Natural lock boundary — mkdir locks the parent. | Tree-spanning operations (mv) need multiple locks → deadlock potential. Readers still collide with concurrent writes. | Moderate contention, mostly-separate subtrees. |
| Per-file RWLock (readers-writer) | One RWLock per File, plus per-directory locks for structure | Many concurrent readers. Writers serialize per file. Fine-grained and real-FS-like. | Most complex. Lock-ordering discipline needed to avoid deadlock (mv holds two dir locks + one file lock). | Production-grade. My default for this interview. |
| Snapshot / Copy-on-Write | No locks on read — readers see an immutable snapshot; writes replace pointers atomically | Perfect read scalability. No reader-writer interference. Great for backup/snapshots. | High write cost (copy the affected path). Memory balloons during concurrent writes. Memory-reclamation is nontrivial (need epoch-based GC or refcounts). | Read-heavy, snapshot semantics desired (e.g. ZFS, etcd). |
Deadlock and recursive operations
mv /a/x /b/y is the classic hazard:
- Holds lock on
/a(to removex). - Acquires lock on
/b(to inserty). - Another thread does
mv /b/z /a/wsimultaneously: holds/b, wants/a. Deadlock.
Mitigation: total lock ordering. Always acquire directory locks in a fixed order (e.g. by path lexicographically, or by node identity / memory address). If you need /a and /b, lock the smaller-named one first. Every caller follows the same rule → no cycles.
Senior-level take
For an interview I'd recommend: per-file RWLock + per-directory lock + lexicographic lock ordering for multi-node operations. It matches the mental model a senior engineer has from dealing with real file systems (this is how ext4 and ZFS think about it), it explains itself cleanly, and it has a clear upgrade path to the snapshot model if read scaling becomes the bottleneck.
Concrete scenarios under per-file RWLock
- Two concurrent reads of
/a.txt— both threads take the read lock; RWLock allows N concurrent readers; no contention. - Read during write on
/a.txt— reader blocks until writer finishes; observes the post-write content. No torn reads. addContentToFile("/a.txt", "X")vsaddContentToFile("/a.txt", "Y")— writers serialize on the file's write lock. Final content is"XY"or"YX"depending on arrival, but never interleaved at the character level.mkdir("/x/y")vsmkdir("/x/z")— both acquire/x's directory write lock serially; each creates a different child. No conflict.mkdir("/x/y")vsmkdir("/x/y")— first wins and creates; second acquires/x's write lock, sees the child now exists, no-ops.mv /a/x /b/y— acquire locks on/aand/bin lexicographic order of path (/afirst). Then the file's write lock. Perform the move. Release in reverse order.- Read iteration over
ls("/a")while another thread doesmkdir("/a/new")—lstakes/a's read lock, snapshots the names, releases.mkdirwaits for read lock to drop. No iterator invalidation.
// Sketch — not executable. Just to show the shape.
class File implements INode {
private readonly lock = new ReadWriteLock();
read(): string {
return this.lock.withRead(() => this.content);
}
append(text: string): void {
this.lock.withWrite(() => {
this.content += text;
});
}
}Scale & Extensibility
Large files
- Store
contentas a list of chunks (string[]of 4KB blocks). Append is O(1); reads concat on demand or stream. File.sizebecomes sum of chunk sizes — maintain incrementally onappend.- Opens the door to lazy loading from a backing store: only page in chunks you're reading.
Persistence
- Add a
Journalthat logs every mutation (mkdir,addContentToFile,delete,rename) as an append-only record. - On restart, replay the journal into an empty tree.
- Compact the journal periodically by taking a snapshot (tree walk → serialized form) + journal-from-snapshot.
- This is exactly how SQLite, LevelDB, and most KV stores handle durability.
Permissions (POSIX-style)
- Add
owner,group,modetoINode.PermissionCheckerclass consults these before every op. mkdirinherits from parent (umask-style).- Implementing this without touching callers is a clean win for the Composite abstraction: the check goes in
FileSystem.traverse, invisible to users.
Symlinks
- New subtype:
class Symlink implements INodewith atarget: Path. traversedereferences symlinks (with a max-hops counter to prevent loops: classic ELOOP on Unix is 40).- Lots of edge cases: symlink to a symlink, symlink to a directory we're listing,
ls -Lvslssemantics (follow or show the link).
Quotas per user
QuotaManagertracks bytes used per owner.addContentToFilepre-checks quota, rejects withEDQUOTon overflow.- Requires the permissions model (you need "owner").
Journaling & crash recovery
- Write-ahead log (WAL) every mutation before applying it.
- On crash, replay WAL from last checkpoint.
- For in-memory FS this is mostly relevant when you add persistence.
Snapshots / versioning
- Make nodes persistent (COW): every mutation creates new nodes up the path to root; unchanged subtrees share.
- Keep a map of snapshot-id → root node. Constant-time snapshots, O(changes) memory per snapshot.
- This is how Git stores trees. Beautiful fit for this data model.
Beyond memory
- Shard the tree: top-level directories routed to different nodes of a distributed service.
- Consistency becomes the hard part — CAP tradeoffs, Paxos/Raft for metadata, etc.
- This is the leap from "in-memory FS" to "distributed FS" (GFS, HDFS). Flag as out of scope unless the interviewer pushes.
Edge Cases
- Empty path
""— Reject inPath.parse. There's no "empty path" in a file system. - Trailing slash
/a/b/—split + filternaturally handles this;"/a/b/".split("/").filter(Boolean) === ["a", "b"]. - Double slashes
/a//b— Same filter handles it. Some systems treat//specially (Cygwin UNC paths); we don't. - Root path
"/"—Path.parse("/").components === [],isRoot() === true.ls("/")lists root's children;mkdir("/")is a no-op;addContentToFile("/", ...)errors (can't write to root). - Nonexistent path on
ls— Throw. LC 588 guarantees existence, but production API must fail loud. - Nonexistent path on
mkdir— That's the entire point; create the chain. addContentToFileunder a file —/a/bexists as a file; someone callsaddContentToFile("/a/b/c", ...).traversehits the file atband throws. We must fail before creating anything.mkdirwhere a file exists —/ais a file;mkdir("/a")must error, not overwrite.lson a file — Per LeetCode 588: returns a single-element array with the file's name, not its content, not an error. This is a spec quirk worth calling out aloud in the interview — it trips up most candidates.- Re-adding content —
addContentToFile("/a", "hello")thenaddContentToFile("/a", " world")→ content is"hello world". Append semantics, not overwrite. - Deeply nested
mkdir—/a/b/c/.../z10k deep. Iterative traversal survives; recursive would stack-overflow. - Concurrent
mkdirof the same path — In single-threaded core, second call is a no-op. In a threaded extension, this is the create-race in Concurrency Considerations. - Reading from a directory — Error. Don't try to serialize
listNames()into a fake "content." - Case-sensitivity —
/A/band/a/bare distinct. TheMap<string, INode>respects this natively because JS string equality is case-sensitive. - Unicode names — Should work since
stringis UTF-16 in JS. Noteworthy if the interviewer digs: normalization (NFC vs NFD) is a real bug source on macOS HFS+. - Zero-length append —
addContentToFile("/a", "")still creates the file if missing. Consistent with spec.
Follow-up Questions
- How would you add symlinks? New
INodesubtype with atarget: Path. Modifytraverseto dereference, with a loop counter to catch cycles. Decidelssemantics: follow or show. - How would you efficiently handle very large files? Chunked storage (list of fixed-size blocks), lazy loading from a backing store, streaming reads.
- How would you persist this to disk? Append-only journal for durability + periodic snapshots for compaction. On startup, replay journal from last snapshot.
- How would you add POSIX permissions?
owner, group, modeonINode;PermissionCheckerinvoked byFileSystem.traverse. Discuss setuid / sticky bit as extensions. - How would you implement
rm -rf /atransactionally? Snapshot / COW: replace root's pointer to/a's parent with a new version that omitsa. Old tree becomes unreachable (or reachable via a snapshot id). Either it happens fully or not at all. - How would you snapshot the FS cheaply? Persistent / COW nodes. Taking a snapshot is just capturing the current root pointer. Mutations build new path-to-root. Structural sharing keeps memory overhead to O(changes per snapshot).
- How would you implement
cpandmv?cp -ris a tree walk + clone each node.mvin the same parent directory is a rename (remove+insert under new name). Cross-directorymvis atomic remove from src + insert at dst — protect with two directory locks in lexicographic order. - How does this compare to a Unix FS on inodes? Unix decouples file content (inode) from file names (directory entries). Multiple names can point to one inode → hard links. Our design fuses name and node, so hard links are impossible without refactoring to an
inode-style indirection. - How would this scale past one machine's memory? Shard by top-level directory. Add a metadata service (Paxos/Raft) for the namespace. Content goes to an object store. That's essentially GFS / HDFS.
- How would you add
ls -R(recursive)? Visitor pattern. Depth-first traversal that yields(path, node)tuples. Use a generator for O(1) extra memory beyond the stack. - How would you add
findwith predicates? Visitor again, with a predicate filter. Build pipelines:walk → filter by name → filter by size → collect. - Race-free
mkdir -p? Either global FS lock, or optimistic: walk, catchAlreadyExistsas "another thread got there first, use their node," retry from the point of conflict.
SDE2 vs SDE3 — How the Bar Rises
| Dimension | SDE2 (mid) | SDE3 (senior) |
|---|---|---|
| Composite understanding | Implements File and Directory as separate classes; sees they share behavior but may not name the pattern. | Names Composite upfront. Justifies INode as the abstraction. Anticipates Symlink/DeviceNode as future subtypes fitting the same hierarchy. |
| Path parsing | Handles /, basic parsing via split. Misses trailing slash, double slash, or treats them as errors instead of normalizing. | Handles //, trailing /, rejects empty / relative / ./.. with clean error messages. Extracts parsing into Path.parse as a single validation point. |
| Traversal | Recursive, clean for shallow trees. Doesn't consider stack-depth pathology. | Iterative walk; calls out the stack-depth concern explicitly. Unifies the walk into a single traverse(path, createMissingDirs) helper. |
| Error modeling | Throws generic Error. Sometimes silently creates on ambiguous inputs. | Distinguishes PathNotFound, NotADirectory, IsADirectory, PermissionDenied. Discusses matching POSIX errno semantics. |
ls on a file | Often missed or returns content. | Calls out the LC 588 quirk proactively: "ls on a file returns a single-element list with the file's name." |
| Concurrency | "We could add a mutex." | Names three lock strategies (global, per-directory, per-file RWLock) with tradeoffs. Discusses lock ordering to prevent deadlock. Mentions COW/snapshot as a fourth approach. |
| Extensibility seams | Says "we could add symlinks later." | Points to INode as the extension point, walks through how traverse would dereference symlinks, mentions cycle detection via hop counter. |
| Large files | Stores as a single string. Aware it's suboptimal. | Proposes chunked storage from the start if files might be large. Ties chunked storage to persistence (each chunk flushes independently). |
| Persistence path | "We could write to disk." | Describes journal + periodic snapshot approach (WAL). Ties it to crash recovery and the typical DB-style playbook. |
| Recursion vs iteration awareness | Doesn't think about it. | Pre-empts the stack-depth concern; iterative by default; mentions trampolining as an option. |
| Scope discipline | Tries to implement symlinks and permissions inline when asked. | Calls scope explicitly: "that's a one-afternoon extension, I'll keep it out of the core and sketch the seam." |
| Testing mindset | Asks for a test harness. | Walks through edge cases unprompted: empty path, trailing slash, file-under-file, ls on a file, concurrent mkdir, deeply nested paths. |
The SDE2 gets a working file system. The SDE3 gets a working file system and a convincing story about what happens next — symlinks, quotas, persistence, concurrency — because the abstractions were drawn with that future in mind from the first line of code.