## The problem
Suppose you have a web page that shows a list of blog posts, each with its
author's name. A naive implementation fetches each author one at a time:
```haskell
-- Fetch each author individually, one query per post!
renderPosts :: [Post] -> AppM [Html]
renderPosts posts = forM posts $ \post -> do
  author <- getUser (postAuthorId post)  -- DB round-trip
  pure (renderPostCard post author)
```
Ten posts means ten separate database queries. A hundred posts means a
hundred queries. This is the N+1 problem: you run 1 query to get the
list, then N more queries to get each related item. It's one of the most
common performance pitfalls in data-access code, and it's easy to
introduce without noticing because each function in isolation looks
perfectly reasonable.
The typical fix is to restructure your code: collect all the
IDs up front, run a single batched query, then stitch the results back
together. That works, but it forces your code shape to match your
optimisation strategy. Composition suffers: you can't freely combine small
functions without worrying about the data-access pattern they produce.
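That manual restructuring can be sketched in plain Haskell. In this toy version, a pure `Map` stands in for the database, and every name (`batchGetUsers`, the `User`/`Post` fields) is illustrative rather than part of sofetch:

```haskell
import qualified Data.Map.Strict as M
import Data.Maybe (mapMaybe)

-- Illustrative types; a real application would query an actual database.
data Post = Post { postId :: Int, postAuthorId :: Int } deriving Show
newtype User = User { userName :: String } deriving (Eq, Show)

-- One batched "query": resolve many author IDs in a single lookup.
batchGetUsers :: [Int] -> M.Map Int User
batchGetUsers ids = M.filterWithKey (\k _ -> k `elem` ids) userTable
  where userTable = M.fromList [(1, User "ada"), (2, User "grace")]

-- The manual fix: collect IDs up front, batch once, stitch results back.
renderPosts :: [Post] -> [(Post, User)]
renderPosts posts =
  let authors = batchGetUsers (map postAuthorId posts)
  in mapMaybe (\p -> (,) p <$> M.lookup (postAuthorId p) authors) posts
```

Note how the batching concern has leaked into `renderPosts` itself: the function can no longer be decomposed without re-deciding where the batch happens.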
## The solution
sofetch fixes this automatically. Write simple, sequential-looking
code, and sofetch batches and deduplicates your data access behind the
scenes:
```haskell
renderPosts :: (MonadFetch m n, DataSource m UserById) => [Post] -> n [Html]
renderPosts posts =
  -- All author fetches are batched into ONE query, automatically.
  fetchThrough (UserById . postAuthorId) posts
    <&> map (\(post, author) -> renderPostCard post author)
```
No matter how many posts you have, this issues a single `WHERE id IN (...)`
query for all the authors. You didn't have to restructure anything. You
wrote the obvious code and sofetch made it fast.
This works across function boundaries too. If `renderPostCard` internally
fetches comment counts, and `renderSidebar` fetches the same authors for a
"top contributors" widget, sofetch merges all of those fetches together.
Functions that were written independently, without any knowledge of each
other, still get optimal batching when composed.
## How it works (in brief)

sofetch gives you a special `Fetch` monad. When you write:

```haskell
(,) <$> fetch (UserById 1) <*> fetch (UserById 2)
```
...the two fetches don't happen immediately. Instead, sofetch collects them
into a round, groups them by data source, and dispatches one batched
call per source. The `<*>` operator (or `ApplicativeDo` if you prefer
do-notation) is the signal that two fetches are independent and can be
batched together. The `>>=` operator (monadic bind) introduces a round
boundary: the right-hand side depends on the left-hand side's result, so it
has to wait.
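Haxl-style designs are often explained with a miniature model of this monad. The following is such a toy sketch, not sofetch's actual implementation: `Blocked` carries the keys requested in the current round, `<*>` merges the rounds of its two sides, and `>>=` sequences them:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Toy model only: Done is a finished computation, Blocked holds the keys
-- requested in the current round plus the rest of the computation.
data Fetch a = Done a | Blocked [String] (Fetch a)
  deriving Functor

instance Applicative Fetch where
  pure = Done
  Done f       <*> x            = fmap f x
  u            <*> Done x       = fmap ($ x) u
  Blocked ks f <*> Blocked js x = Blocked (ks ++ js) (f <*> x)  -- merge rounds

instance Monad Fetch where
  Done x       >>= k = k x
  Blocked ks f >>= k = Blocked ks (f >>= k)  -- round boundary: k must wait

fetch :: String -> Fetch ()
fetch key = Blocked [key] (Done ())

-- The keys dispatched in each successive round.
rounds :: Fetch a -> [[String]]
rounds (Done _)          = []
rounds (Blocked ks rest) = ks : rounds rest
```

With this model, `fetch "A" *> fetch "B"` yields one round of two keys, while `fetch "A" >>= \_ -> fetch "B"` yields two rounds of one key each; that is exactly the batching distinction described above.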
```mermaid
flowchart LR
    f1["fetch (UserById 1)"] --> b1["batchFetch<br/>[UserById 1, 2]"]
    f2["fetch (UserById 2)"] --> b1
    f3["fetch (PostsByAuthor 1)"] --> b2["batchFetch<br/>[PostsByAuthor 1]"]
    b1 -. "concurrent" .- b2
```
Within each round:
- Keys for the same data source are grouped into one `batchFetch` call.
- Keys for different data sources run concurrently.
- Duplicate keys are deduplicated: the same key appearing in multiple
  places produces only one fetch, and all callers share the result.
- Results are cached, so the same key never hits the database twice (unless
  you opt out).
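The grouping and deduplication steps can be sketched in a few lines of plain Haskell. This is a toy model with made-up key and source names; sofetch's real engine works over arbitrary `FetchKey` types and runs the groups concurrently:

```haskell
import Data.List (nub)
import qualified Data.Set as S

-- Hypothetical keys standing in for real FetchKey types.
data Key = UserById Int | PostsByAuthor Int
  deriving (Eq, Ord, Show)

-- Which data source a key belongs to.
sourceOf :: Key -> String
sourceOf (UserById _)      = "users"
sourceOf (PostsByAuthor _) = "posts"

-- One round: dedupe the requested keys, then group them by source so each
-- source receives a single batched call.
planRound :: [Key] -> [(String, [Key])]
planRound ks =
  [ (s, [k | k <- uniq, sourceOf k == s])
  | s <- nub (map sourceOf uniq) ]
  where uniq = S.toAscList (S.fromList ks)
```

For example, four requests containing a duplicate `UserById 1` collapse into two batches: one for the user source with two keys, one for the post source with one key.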
## Quick start

### 1. Define key types

Each kind of data you want to fetch gets a key type: a small type that
says "I want to look up this thing" and declares what the result will be.
This is the core modelling step, one key type per query shape.
```haskell
{-# LANGUAGE DeriveGeneric, DeriveAnyClass, DerivingStrategies, TypeFamilies #-}

data User = User { userId :: Int, userName :: Text }
data Post = Post { postId :: Int, postAuthorId :: Int, postTitle :: Text }

-- "Give me a user by their ID"
newtype UserById = UserById Int
  deriving stock (Eq, Ord, Show, Generic)
  deriving anyclass (Hashable)

instance FetchKey UserById where
  type Result UserById = User

-- "Give me all posts by this author"
newtype PostsByAuthor = PostsByAuthor Int
  deriving stock (Eq, Ord, Show, Generic)
  deriving anyclass (Hashable)

instance FetchKey PostsByAuthor where
  type Result PostsByAuthor = [Post]
```
The key type carries the query parameter (the user ID, the author ID), and
the `FetchKey` instance tells sofetch what type the answer will be. All the
required instances (`Eq`, `Hashable`, `Show`, etc.) are stock-derivable; no
boilerplate.
### 2. Teach sofetch how to fetch them

A `DataSource` instance tells sofetch how to batch-fetch a group of keys.
You receive a `NonEmpty` list of keys and return a `HashMap` of results,
one entry per key:
```haskell
instance DataSource AppM UserById where
  batchFetch keys = do
    pool <- asks appPool
    let ids = [uid | UserById uid <- toList keys]
    rows <- liftIO $ withResource pool $ \conn ->
      query conn "SELECT id, name FROM users WHERE id = ANY(?)" (Only ids)
    pure $ HM.fromList [(UserById (userId u), u) | u <- rows]
```
The `AppM` parameter is your monad. If it has access to a connection
pool, config, or anything else, your data source has access to it too.
No special environment setup is needed.
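The examples leave `AppM` abstract. One plausible shape, assuming the common `ReaderT`-over-`IO` pattern, is sketched below; `DbPool` and `AppEnv`'s contents are hypothetical stand-ins, not part of sofetch:

```haskell
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Hypothetical stand-ins: the examples never pin down AppEnv's contents.
data DbPool = DbPool
data AppEnv = AppEnv { appPool :: DbPool }

-- The simplest AppM that fits the examples: a ReaderT-over-IO stack.
type AppM = ReaderT AppEnv IO

runAppM :: AppEnv -> AppM a -> IO a
runAppM = flip runReaderT

-- `asks appPool` is the access pattern the batchFetch example relies on.
getPool :: AppM DbPool
getPool = asks appPool
```

Any monad with this shape also satisfies `MonadUnliftIO`, which matters for the `fetchConfigIO` runner in step 4.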
If your backend doesn't support batch lookups (e.g. a REST API that only
fetches one item at a time), implement `fetchOne` instead. sofetch will
call it for each key:
```haskell
instance DataSource AppM UserById where
  fetchOne (UserById uid) = lookupUserById uid
```
You still get deduplication and caching; you just don't get the batched
SQL.
### 3. Write data-access code

Now use `fetch` in your application code. Program against the `MonadFetch`
typeclass so your functions work with any implementation (production, tests,
tracing):
```haskell
getUserFeed :: (MonadFetch m n, DataSource m UserById, DataSource m PostsByAuthor)
            => Int -> n (User, [Post])
getUserFeed uid =
  (,) <$> fetch (UserById uid) <*> fetch (PostsByAuthor uid)
```
These two fetches are independent (`<*>`), so sofetch batches them into a
single round. If you prefer do-notation, enable `ApplicativeDo` and write
the equivalent:
```haskell
{-# LANGUAGE ApplicativeDo #-}

getUserFeed uid = do
  user  <- fetch (UserById uid)      -- batched together
  posts <- fetch (PostsByAuthor uid) -- in one round
  pure (user, posts)
```
Both forms produce identical batching behaviour.
### 4. Run it
```haskell
handleRequest :: AppEnv -> Int -> IO (User, [Post])
handleRequest env uid = runAppM env $ do
  cfg <- fetchConfigIO
  runFetch cfg (getUserFeed uid)
```
`fetchConfigIO` works for any `MonadUnliftIO` monad (which includes any
`ReaderT env IO` stack, the most common pattern). It wires everything up
automatically.
### 5. Test it
Swap the real data sources for canned data. No IO, no database:
```haskell
testGetUserFeed :: IO ()
testGetUserFeed = do
  let mocks = mockData @UserById      [(UserById 1, testUser)]
           <> mockData @PostsByAuthor [(PostsByAuthor 1, [testPost])]
  (user, posts) <- runMockFetch @AppM mocks (getUserFeed 1)
  assertEqual user testUser
  assertEqual posts [testPost]
```
Because `getUserFeed` is polymorphic over `MonadFetch`, it runs unchanged
against `MockFetch`. No special test wiring is needed.
## A real example: collapsing N+1 cascades
Here's a scenario from the included SQLite example. A blog page needs to
render three authors, each with their posts, each post with its comments,
each comment with its author name. The functions are written independently
at four different levels:
```
renderBlogPage                fetches 3 authors
└─ renderAuthorProfile        fetches posts for an author
   └─ renderPostWithComments  fetches comments for a post
      └─ renderComment        fetches the comment's author
```
Without sofetch, this is 25+ database queries. With sofetch, `traverse`
automatically merges fetches at the same depth:
```mermaid
flowchart LR
    subgraph R1 ["Round 1"]
        A1["UserById 1, 2, 3"]
    end
    subgraph R2 ["Round 2"]
        A2["PostsByAuthor 1, 2, 3"]
    end
    subgraph R3 ["Round 3"]
        A3["CommentsByPost 1 … 7"]
    end
    subgraph R4 ["Round 4"]
        A4["UserById 4, 5 (deduped)"]
    end
    R1 --> R2 --> R3 --> R4
```
Four rounds, four SQL queries, regardless of the data size. The functions
never coordinate with each other. They don't know they're being composed.
sofetch handles it.
## Key features
- **No GADTs.** Data sources are ordinary typeclasses. Key types use stock
  deriving. If you've defined a newtype, you're 90% of the way to a data
  source.
- **Your monad, your resources.** `DataSource` is parameterised by your
  monad, not some framework environment. Connection pools, config, anything
  your monad carries: your data sources can use it. Missing instances are
  compile-time errors, not runtime crashes.
- **Monad transformer.** `Fetch m a` layers over your existing monad stack.
  Drop it in without restructuring your application.
- **Swappable implementations.** `MonadFetch` is the interface your
  application code uses. Production, test, and traced implementations all
  satisfy it. Swap without code changes.
- **Extensible instrumentation.** `runLoopWith` lets you wrap each batch
  round (e.g. with tracing spans). OpenTelemetry support lives in the
  separate `sofetch-otel` package.
```mermaid
flowchart TD
    A["Application code"] -->|"programs against"| B["MonadFetch (typeclass)"]
    B --> C["Fetch m<br/>production"]
    B --> D["MockFetch<br/>testing"]
    B --> E["TracedFetch<br/>instrumentation"]
    C --> F["DataSource instances<br/>UserById · PostsByAuthor · …"]
```
## Combinators
sofetch includes a toolkit for common patterns:
| Combinator | What it does |
|---|---|
| `fetchAll keys` | Fetch a list of keys in one round |
| `fetchThrough toKey items` | Extract a key from each item, fetch, pair back |
| `fetchMap toKey combine items` | Like `fetchThrough` but transform the pair |
| `fetchMaybe maybeKey` | Fetch if the key is present |
| `fetchMapWith keys` | Fetch a collection, return a `HashMap` of results |
| `filterA predicate items` | Applicative filter; all predicates batched |
| `withDefault val action` | Return a default on any exception |
| `pAnd` / `pOr` | Parallel short-circuiting boolean combinators |
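To see why `filterA` batches where `filterM` would not, here is a reference implementation of its shape (an assumption; the library's actual definition may differ). Because `traverse` combines the predicate's effects with `<*>`, every predicate lands in the same round under a `Fetch`-style applicative:

```haskell
-- Reference implementation of filterA's shape (sketch, not sofetch's code).
-- traverse runs the predicate over each element and combines the resulting
-- effects applicatively; concat keeps only the elements whose predicate
-- returned True.
filterA :: Applicative f => (a -> f Bool) -> [a] -> f [a]
filterA p = fmap concat . traverse (\x -> (\keep -> [x | keep]) <$> p x)
```

With any ordinary `Applicative` it behaves like a plain filter, e.g. keeping the even numbers of a list under `Maybe`.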
## Advanced usage

### Shared cache across phases
To preserve the cache across sequential computations, use `runFetch'`, which
returns the cache alongside the result:
```haskell
handleTwoPhases :: AppEnv -> IO [Post]
handleTwoPhases env = runAppM env $ do
  cfg <- fetchConfigIO
  -- Phase 1: populate cache
  (_users, cache) <- runFetch' cfg $
    fetchAll [UserById 1, UserById 2, UserById 3]
  -- Phase 2: cached keys resolve without hitting the DB
  runFetch cfg { configCache = Just cache } $
    fetchAll [PostsByAuthor 1, PostsByAuthor 2]
```
### Restricted monads (no `MonadIO`)
For monads that deliberately hide IO (e.g. a `Transaction` type that
prevents arbitrary IO inside database transactions), use `fetchConfig` with
explicit natural transformations and export a safe runner:
```haskell
fetchInTransaction :: Fetch Transaction a -> Transaction a
fetchInTransaction = runFetch (fetchConfig unsafeRunTransaction unsafeLiftIO)
```
The unsafe escape hatches stay private to your DB module. Application code
calls `fetchInTransaction` and never touches IO.

See `examples/SqliteBlog.hs` (scenario 12) for a worked proof of concept.
## Examples

The `examples/` directory contains two runnable programs:
```sh
stack build --flag sofetch:examples
stack exec sqlite-blog
stack exec github-explorer
```
**SQLite blog** (`examples/SqliteBlog.hs`): a blog platform backed by
in-memory SQLite. Every `batchFetch` prints its SQL so you can see exactly
how fetches are batched. Covers applicative batching, N+1 avoidance,
deduplication, deep N+1 across function boundaries, faceted queries, chunked
batching, shared caches, mocks, and restricted monads.

**GitHub explorer** (`examples/GitHubExplorer.hs`): concurrent exploration
of the GitHub REST API. Demonstrates sofetch with HTTP backends, where the
value is concurrency, deduplication, and caching rather than SQL batching.
## Packages
| Package | Description |
|---|---|
| `sofetch` | Core library: `Fetch`, `DataSource`, `MonadFetch`, cache, engine, mocks, tracing hooks |
| `sofetch-otel` | OpenTelemetry instrumentation via `runFetchWithOTel` |
## Modules
| Module | Contents |
|---|---|
| `Fetch` | Top-level re-exports |
| `Fetch.Class` | `FetchKey`, `DataSource`, `MonadFetch`, `MonadFetchBatch`, `Status`, `Batches` |
| `Fetch.Batched` | `Fetch` monad transformer, runners, `runLoopWith` |
| `Fetch.Engine` | Batch dispatch with strategy-based scheduling |
| `Fetch.Cache` | IVar-based cache with dedup, eviction, warming |
| `Fetch.IVar` | Write-once variable with error support |
| `Fetch.Combinators` | `fetchAll`, `fetchThrough`, `fetchMap`, etc. |
| `Fetch.Mock` | `MockFetch` for testing |
| `Fetch.Traced` | `TracedFetch` with per-round callbacks |
| `Fetch.Mutate` | `Mutate` for interleaved read-write computations |
| `Fetch.Memo` | `MemoStore`, `memo`, `memoOn` |
| `Fetch.Deriving` | Helpers for writing instances (`optionalBatchFetch`, DerivingVia docs) |
## Design
See `docs/DESIGN.md` for the full set of design decisions and tradeoffs.
sofetch is inspired by Facebook's Haxl
(Marlow et al., *There is no fork: an abstraction for efficient, concurrent,
and concise data access*, ICFP 2014). It keeps the core idea (write
sequential-looking code, get batched data access) while replacing the
GADT-based data source API with type families and ordinary typeclasses, and
using a monad-transformer design instead of a bespoke environment. See
`DESIGN.md` for a detailed comparison.