Savvi Studio

Cursor Performance Optimization

Purpose: Optimize cursor-based pagination performance
Last Updated: 2024-11-26

Overview

This guide covers performance optimization strategies for cursor-based pagination.

Index Requirements

Offset-Based

No special index needed - uses default primary key index.

-- Default index sufficient
CREATE INDEX ON nodes(id);

ID-Based

Index on sort column required.

CREATE INDEX idx_nodes_id ON nodes(id);

Value-Based

Composite index required for optimal performance.

CREATE INDEX idx_nodes_created_at_id 
  ON nodes(created_at, id);

Query Performance

Offset-Based Performance

-- Bad: O(n) with large offsets
EXPLAIN ANALYZE
SELECT * FROM nodes LIMIT 25 OFFSET 10000;
-- Seq Scan or slow Index Scan: 500ms+

Problem: Database must scan all skipped rows.

ID-Based Performance

-- Good: O(log n) with index
EXPLAIN ANALYZE
SELECT * FROM nodes WHERE id > 10000 LIMIT 25;
-- Index Scan: 2ms

Benefit: Constant time regardless of position in dataset.

Optimization Tips

1. Use Appropriate Cursor Type

  • Small datasets (< 1000): Offset acceptable
  • Large datasets: ID-based or value-based required
  • Custom sorting: Value-based with composite index

2. Add Proper Indexes

-- For ID-based
CREATE INDEX idx_nodes_id ON nodes(id);

-- For value-based with created_at
CREATE INDEX idx_nodes_created_at_id 
  ON nodes(created_at, id);

-- Include frequently filtered columns
CREATE INDEX idx_nodes_status_created_at_id 
  ON nodes(status, created_at, id)
  WHERE status = 'active';

3. Limit Page Size

const MAX_LIMIT = 100;
const limit = Math.min(requestedLimit, MAX_LIMIT);

Prevents resource exhaustion from large page requests.

4. Cache Cursor Metadata

// Client-side caching
localStorage.setItem('lastCursor', cursor);
localStorage.setItem('totalFetched', totalCount.toString());

5. Use Partial Indexes

For filtered queries, create partial indexes:

CREATE INDEX idx_active_nodes_created_at_id 
  ON nodes(created_at, id)
  WHERE status = 'active';

Performance Comparison

Type Dataset Size Page 1 Page 100 Page 1000
Offset 10K rows 2ms 50ms 500ms
Offset 100K rows 5ms 200ms 2000ms
ID-based 10K rows 2ms 2ms 2ms
ID-based 100K rows 2ms 2ms 2ms
Value-based 10K rows 3ms 3ms 3ms
Value-based 100K rows 3ms 3ms 3ms

Conclusion: Keyset pagination (ID-based/value-based) maintains constant performance.

Monitoring

Query Analysis

-- Check query plan
EXPLAIN ANALYZE
SELECT * FROM nodes WHERE id > 10000 LIMIT 25;

-- Check index usage
SELECT 
  schemaname, tablename, indexname, 
  idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE tablename = 'nodes';

Performance Metrics

Monitor:

  • Query execution time
  • Index hit rate
  • Cache hit rate
  • Connection pool usage

Troubleshooting

Slow Queries

Check:

  1. Is appropriate index present?
  2. Is index being used? (check EXPLAIN)
  3. Are statistics up to date? (ANALYZE table)
  4. Is page size reasonable?

Fix:

-- Update statistics
ANALYZE nodes;

-- Create missing index
CREATE INDEX idx_nodes_id ON nodes(id);

High Memory Usage

Cause: Page size too large

Fix: Enforce maximum page size in application code.

Index Not Used

Cause: Query planner choosing seq scan

Check:

EXPLAIN SELECT * FROM nodes WHERE id > 10000 LIMIT 25;

Fix: Ensure index exists and statistics are current.

Best Practices

✅ Do

  • Use ID-based or value-based cursors for large datasets
  • Create appropriate indexes before going to production
  • Monitor query performance regularly
  • Set reasonable maximum page sizes
  • Update statistics regularly with ANALYZE

❌ Don't

  • Use offset pagination for large datasets
  • Allow unbounded page sizes
  • Skip index creation
  • Ignore query performance metrics
  • Create unnecessary indexes (maintenance overhead)

Next steps: See API Integration for endpoint patterns.