Files over 10 MB will slow most browser tools; over 50 MB will crash them. Here's how to handle large JSON without losing your mind.
JSON.parse() loads the entire file into memory before returning anything. For small files this is fine. For a 100MB database export, it will exhaust browser memory and crash the tab. This guide covers the practical strategies that actually work when JSON is too large to handle with standard tools.
The standard parsing APIs in browsers, Node.js, Python, and Java all work the same way: they read the entire input, build a complete in-memory representation, and only then return it. A 100MB JSON file can expand to 500MB-1GB in memory because the in-memory object graph carries overhead for each key, value, and reference. Browser tabs typically have a memory limit of 512MB to 2GB and no swap. The result is a hard crash.
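The expansion is easy to check for yourself. The following is a minimal Python sketch, not a benchmark: it builds a synthetic array of records (the field names are made up for illustration), serializes it, and uses the standard tracemalloc module to compare the size of the JSON text with the memory held by the parsed objects. The exact ratio depends on the runtime and on the shape of the data.

```python
import json
import tracemalloc


def parsed_overhead(n_records: int = 20_000) -> tuple[int, int]:
    """Return (bytes of JSON text, bytes held by the parsed objects)."""
    # Synthetic data; the field names are illustrative only.
    records = [
        {"id": i, "name": f"user{i}", "email": f"user{i}@example.com"}
        for i in range(n_records)
    ]
    text = json.dumps(records)
    del records  # keep only the serialized form

    tracemalloc.start()
    data = json.loads(text)  # the whole object graph is built before this returns
    held, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    assert len(data) == n_records
    return len(text.encode("utf-8")), held


if __name__ == "__main__":
    raw, held = parsed_overhead()
    print(f"JSON text: {raw / 1e6:.1f} MB, parsed objects: {held / 1e6:.1f} MB")
```

Running it prints both sizes, which gives a rough feel for how much headroom a full parse needs on your own data.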
Rule of thumb:
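jq is a lightweight command-line JSON processor that lets you filter, transform, and extract data without opening the file in a browser or editor. By default jq still parses each top-level JSON value fully into memory, so it needs RAM proportional to the file, but it copes with files far larger than a browser tab can. For inputs that exceed available memory, its --stream option processes the file incrementally. Install it from jqlang.org.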
# Pretty-print the first 10 array elements
jq '.[0:10]' large-data.json
# Get all keys of the root object
jq 'keys' large-data.json
# Count array items
jq 'length' large-data.json
# Extract a specific field from every object in an array
jq '.[].email' users.json
# Filter: only objects where age > 30
jq '[.[] | select(.age > 30)]' users.json
# Extract just the first item
jq '.[0]' large-array.json
# Get only specific fields from each object
jq '[.[] | {name, email}]' users.json
# Convert array of objects to CSV
jq -r '.[] | [.name, .email, .age] | @csv' users.json
# Extract a single nested object
jq '.config.database' app-config.json > database-config.json
# Now paste database-config.json into the browser formatter
The stream-json npm package streams JSON token by token, so memory usage stays constant regardless of file size:
const { createReadStream } = require('fs');
const { chain } = require('stream-chain');
const { parser } = require('stream-json');
const { streamArray } = require('stream-json/streamers/StreamArray');
let count = 0;
chain([
  createReadStream('large-users.json'),
  parser(),
  streamArray()
]).on('data', ({ value }) => {
  // Process one object at a time: constant memory
  console.log(value.email);
  count++;
}).on('end', () => {
  console.log(`Processed ${count} records`);
});
ijson is the Python equivalent: it iterates over items in a large JSON array without loading the whole file:
import ijson
with open('large-users.json', 'rb') as f:
    for user in ijson.items(f, 'item'):
        # 'item' = each element of the top-level array
        print(user['email'])
        # Process one record at a time
For deeply nested paths:
import ijson
with open('response.json', 'rb') as f:
    # Extract items from data.users array
    for user in ijson.items(f, 'data.users.item'):
        process(user)  # process() stands in for your own handler
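If adding a dependency such as ijson is not an option, the same idea can be approximated with the standard library alone. The sketch below is an illustration, not a replacement for a real streaming parser: iter_array_items is a hypothetical helper name, it only handles a top-level array, it is lenient about commas, and on malformed input it keeps reading to the end of the file before raising. It reads fixed-size chunks and uses json.JSONDecoder.raw_decode to pull one complete element at a time out of the buffer.

```python
import json
from typing import Any, Iterator, TextIO

_WHITESPACE = " \t\r\n"


def iter_array_items(fp: TextIO, chunk_size: int = 65536) -> Iterator[Any]:
    """Yield the elements of a top-level JSON array one at a time."""
    decoder = json.JSONDecoder()
    buf = ""        # text read from fp but not yet consumed
    pos = 0         # index of the next unconsumed character in buf
    opened = False  # True once the opening '[' has been seen

    while True:
        while pos < len(buf) and buf[pos] in _WHITESPACE:
            pos += 1

        if pos < len(buf):
            ch = buf[pos]
            if not opened:
                if ch != "[":
                    raise ValueError("expected a top-level JSON array")
                opened = True
                pos += 1
                continue
            if ch == "]":
                return
            if ch == ",":
                pos += 1
                continue
            try:
                value, end = decoder.raw_decode(buf, pos)
            except json.JSONDecodeError:
                pass  # the element is probably cut off; read more below
            else:
                # Accept the value only when a delimiter follows it, so a
                # number split across two chunks is never yielded half-read.
                nxt = end
                while nxt < len(buf) and buf[nxt] in _WHITESPACE:
                    nxt += 1
                if nxt < len(buf) and buf[nxt] in ",]":
                    yield value
                    pos = end
                    continue

        chunk = fp.read(chunk_size)
        if not chunk:
            raise ValueError("truncated or malformed JSON array")
        buf = buf[pos:] + chunk  # drop consumed text, append the new chunk
        pos = 0
```

Open the file in text mode and iterate, for example `for user in iter_array_items(open('large-users.json', encoding='utf-8')): ...`. Memory use is bounded by the chunk size plus the largest single element, rather than by the file size.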
If you need to work with all the data in a browser tool, split the large array into smaller files first:
# Split a 100,000-item array into chunks of 1,000
jq -c '.[]' large.json | split -l 1000 - chunk_
# Then wrap each chunk back into an array
for f in chunk_*; do
  jq -s '.' "$f" > "${f}.json"
done
# Python alternative: split the array into chunk files
import json
import math
CHUNK_SIZE = 1000
with open('large.json') as f:
    data = json.load(f)  # Still loads everything into memory once
for i in range(0, len(data), CHUNK_SIZE):
    chunk = data[i:i + CHUNK_SIZE]
    with open(f'chunk_{i // CHUNK_SIZE}.json', 'w') as out:
        json.dump(chunk, out, indent=2)
# Ceiling division: an exact multiple of CHUNK_SIZE does not add an extra file
print(f'Created {math.ceil(len(data) / CHUNK_SIZE)} chunk files')
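When the source is too large for json.load, one option is to let jq emit one record per line first (jq -c '.[]' large.json > records.jsonl) and then group the lines in Python. The sketch below assumes that intermediate file already exists; split_jsonl and the chunk file naming are illustrative choices, not part of any library. Only one chunk of records is held in memory at a time.

```python
import json
from itertools import islice
from pathlib import Path


def split_jsonl(src: str, out_dir: str, chunk_size: int = 1000) -> list[Path]:
    """Write every `chunk_size` records of a JSONL file to its own JSON array file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    with open(src, encoding="utf-8") as f:
        # Lazy generator: lines are parsed only as each chunk is filled.
        records = (json.loads(line) for line in f if line.strip())
        while True:
            chunk = list(islice(records, chunk_size))
            if not chunk:
                break
            path = out / f"chunk_{len(written):05d}.json"
            with open(path, "w", encoding="utf-8") as dst:
                json.dump(chunk, dst)
            written.append(path)
    return written
```

Each output file is a valid JSON array small enough to paste into a browser tool.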
For very large files (hundreds of MB or more) that you need to query repeatedly, importing into a database is the right long-term approach.
-- SQLite: load a JSON array into a table
-- Uses the json_each() table-valued function; readfile() is provided by the sqlite3 command-line shell
CREATE TABLE users AS
SELECT
  json_extract(value, '$.id') AS id,
  json_extract(value, '$.name') AS name,
  json_extract(value, '$.email') AS email
FROM json_each(readfile('users.json'));
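-- PostgreSQL: import the file into a staging table, then query with JSON operators
-- Assumes one JSON object per line (JSONL); COPY's text format treats backslashes as escapes, so strings containing them need care
CREATE TABLE staging (raw jsonb);
COPY staging(raw) FROM '/path/to/data.json';
SELECT raw->>'name', raw->>'email'
FROM staging
WHERE (raw->>'age')::int > 30;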
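The same import can be scripted with Python's built-in sqlite3 module, which avoids relying on the command-line shell's readfile(). This is a sketch under assumptions: the input is an iterable of JSONL lines, the users table and its four columns mirror the examples above, and load_users is a hypothetical helper name. Parsing happens in Python, so it does not depend on SQLite's JSON functions being compiled in.

```python
import json
import sqlite3
from typing import Iterable


def load_users(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Insert one row per JSONL line and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(id INTEGER, name TEXT, email TEXT, age INTEGER)"
    )
    # Generator: rows are produced lazily, so the input is never held in full.
    rows = (
        (r.get("id"), r.get("name"), r.get("email"), r.get("age"))
        for r in (json.loads(line) for line in lines if line.strip())
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Pass an open file object as `lines` to stream a large export straight into the table, then query it with ordinary SQL such as `SELECT name, email FROM users WHERE age > 30`.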
If you control the data source, use JSONL format for large datasets. Each line is a valid, standalone JSON object. This enables line-by-line processing with standard Unix tools without any special parser:
# Each line is one JSON object:
{"id":1,"name":"Alice","email":"alice@example.com"}
{"id":2,"name":"Bob","email":"bob@example.com"}
# Count lines (= record count)
wc -l data.jsonl
# Extract a field from every line with jq
jq -r '.email' data.jsonl
# Filter with jq
jq 'select(.age > 30)' data.jsonl
# Process line by line in Python
import json
with open('data.jsonl') as f:
    for line in f:
        record = json.loads(line.strip())
        process(record)  # process() stands in for your own handler
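Real JSONL files often contain blank lines or the occasional corrupt record. A slightly more defensive reader might look like the sketch below; read_jsonl is an illustrative name, and the choice to raise on the first bad line (rather than skip it) is an assumption you may want to change.

```python
import json
from typing import Any, Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed record per non-blank line, reporting bad lines by number."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: {exc.msg}") from exc
```

Because it is a generator, it composes with ordinary filtering, for example `over_30 = (r for r in read_jsonl(f) if r.get("age", 0) > 30)`, without ever holding the whole file in memory.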
| File Size | Best Approach |
|---|---|
| Under 5MB | Browser formatter (use this tool) |
| 5-50MB | jq to extract the section you need, then browser tool |
| 50MB-1GB | Streaming parser (stream-json / ijson) or jq |
| Over 1GB | Database import (SQLite or PostgreSQL) |
| Any size, repeated queries | Convert to JSONL or import to database |
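The table can be turned into a small helper for scripts that need to pick a strategy automatically. This is only a sketch of the thresholds above: recommend is a hypothetical function, and it assumes decimal megabytes (1MB = 1,000,000 bytes), which the table does not specify.

```python
MB = 1_000_000  # assumption: decimal megabytes
GB = 1_000 * MB


def recommend(size_bytes: int, repeated_queries: bool = False) -> str:
    """Map a file size to the approach suggested in the table."""
    if repeated_queries:
        return "convert to JSONL or import into a database"
    if size_bytes < 5 * MB:
        return "browser formatter"
    if size_bytes <= 50 * MB:
        return "jq to extract a section, then browser tool"
    if size_bytes <= GB:
        return "streaming parser (stream-json / ijson) or jq"
    return "database import (SQLite or PostgreSQL)"
```

Combine it with os.path.getsize, for example `recommend(os.path.getsize('data.json'))`, to decide before opening the file.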
A JSON file that is 100MB uncompressed often compresses to 10-20MB with gzip, because JSON is highly repetitive text (repeated key names, whitespace, common values). Compression is almost always worth enabling for large JSON files transmitted over HTTP.
# Compress a JSON file with gzip
gzip -k large-data.json # creates large-data.json.gz (keeps original)
gzip -d large-data.json.gz # decompress
# Pipe jq output directly into gzip
jq -c '.[]' large-data.json | gzip > extracted.jsonl.gz
// Node.js: stream with compression
const { createReadStream, createWriteStream } = require('fs');
const { createGzip } = require('zlib');
createReadStream('data.json')
  .pipe(createGzip())
  .pipe(createWriteStream('data.json.gz'))
  .on('finish', () => console.log('Compressed'));
For HTTP APIs, enable gzip in your server and set Accept-Encoding: gzip on the client. Most frameworks (Express, FastAPI, Spring Boot) have a one-line setting to compress responses automatically.
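Compressed files do not have to be unpacked to disk before processing. As a sketch, Python's standard gzip module can write and read JSONL line by line through the compression layer; write_jsonl_gz and read_jsonl_gz are illustrative helper names, and the .jsonl.gz layout (one record per line) is an assumption matching the jq pipeline above.

```python
import gzip
import json
from typing import Any, Iterable, Iterator


def write_jsonl_gz(path: str, records: Iterable[Any]) -> None:
    """Write records as gzip-compressed JSONL, one record per line."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl_gz(path: str) -> Iterator[Any]:
    """Yield records from a gzip-compressed JSONL file without unpacking it to disk."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Both directions stream, so memory use stays close to the size of a single record while the file on disk stays small.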
If your script or browser tool is running out of memory when processing JSON, these steps help identify the problem:
Check the file size: run ls -lh data.json (Linux/Mac) or check properties in Windows Explorer. Anything over 10MB should not go into a browser formatter.
Count the records: run jq 'length' data.json to count top-level array items, or wc -l data.jsonl for JSONL files. This tells you what you are dealing with before you commit to loading everything.
Inspect one record: run jq '.[0]' data.json. Understanding the structure first prevents wasted processing.
Raise the Node.js heap limit: run node --max-old-space-size=4096 script.js to increase the V8 heap limit. If the process still crashes, you need a streaming approach.
Profile memory in Python: run pip install memory-profiler, then mprof run script.py to record memory usage over time (mprof plot charts it). For line-by-line figures, decorate a function with @profile and run python -m memory_profiler script.py.
When large JSON files live in cloud storage (Amazon S3, Google Cloud Storage, Azure Blob), you have additional options that avoid downloading the whole file:
# Python: convert JSONL to Parquet using pandas
import pandas as pd
df = pd.read_json('large-data.jsonl', lines=True)
df.to_parquet('data.parquet', compression='snappy')
# Later: read back only the columns you need
df2 = pd.read_parquet('data.parquet', columns=['name', 'email', 'age'])
Use the JSON Formatter Hub to format, validate, and fix your JSON right now. It is free and fully browser-based.