Thoughts on SPI flash and filesystems

2024-08-04

I've been idly thinking about writing a simple (Q)SPI flash oriented filesystem lately. I think this is an interesting niche. The two filesystems I am aware of in this space are:

To be clear, I have never designed a general purpose filesystem before! I've designed simpler "slot based" or "log ring" filesystems, which usually treat the whole flash as a ring buffer (for wear leveling) and overwrite the oldest data. Those are much more similar to something like the sequential-storage crate by Dion Dokter from TweedeGolf.
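For a sense of what I mean by a "log ring", here is a minimal in-memory sketch in Python. The `LogRing` class, its slot layout, and the sequence counter are all hypothetical names made up for illustration; this is not how sequential-storage works internally, just the general shape of the idea.

```python
class LogRing:
    """Toy model of a flash log ring: fixed slots, oldest entry overwritten."""

    def __init__(self, slot_count):
        # Each slot is either None (never written) or (sequence, payload).
        self.slots = [None] * slot_count
        self.next_seq = 0  # monotonically increasing sequence number

    def push(self, payload):
        # Writes always advance around the ring, so every slot is
        # rewritten equally often (a crude form of wear leveling).
        index = self.next_seq % len(self.slots)
        self.slots[index] = (self.next_seq, payload)
        self.next_seq += 1
        return index

    def entries(self):
        # Recover ordering from the sequence numbers, oldest first.
        live = [slot for slot in self.slots if slot is not None]
        return [payload for _, payload in sorted(live)]
```

On real flash, "overwriting" a slot means erasing it first, and the sequence number would have to be stored alongside the data so the order can be recovered after a reboot.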

Some facts about SPI flash

Back of an envelope speed numbers

An interesting detail of the speed:size ratio of an external flash is that it becomes somewhat reasonable to scan the entire device occasionally, either for recovery or normal operation.

Speed   Size     SPI time (s)   QSPI time (s)
1MHz    1MiB     8.388608       2.097152
1MHz    8MiB     67.108864      16.777216
1MHz    16MiB    134.217728     33.554432
1MHz    32MiB    268.435456     67.108864
1MHz    128MiB   1073.741824    268.435456
8MHz    1MiB     1.048576       0.262144
8MHz    8MiB     8.388608       2.097152
8MHz    16MiB    16.777216      4.194304
8MHz    32MiB    33.554432      8.388608
8MHz    128MiB   134.217728     33.554432
32MHz   1MiB     0.262144       0.065536
32MHz   8MiB     2.097152       0.524288
32MHz   16MiB    4.194304       1.048576
32MHz   32MiB    8.388608       2.097152
32MHz   128MiB   33.554432      8.388608
80MHz   1MiB     0.1048576      0.0262144
80MHz   8MiB     0.8388608      0.2097152
80MHz   16MiB    1.6777216      0.4194304
80MHz   32MiB    3.3554432      0.8388608
80MHz   128MiB   13.4217728     3.3554432
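The numbers in this table are consistent with a very simple model: one bit per clock cycle per data lane (one lane for SPI, four for QSPI), with no command, address, or dummy-cycle overhead. A rough Python sketch of that arithmetic follows; the function name `full_read_seconds` is just something I made up for illustration.

```python
MIB = 1024 * 1024

def full_read_seconds(size_bytes, clock_hz, lanes=1):
    """Time to clock out `size_bytes`, ignoring all protocol overhead."""
    bits = size_bytes * 8
    return bits / (clock_hz * lanes)

if __name__ == "__main__":
    for clock_mhz in (1, 8, 32, 80):
        for size_mib in (1, 8, 16, 32, 128):
            size = size_mib * MIB
            clock = clock_mhz * 1_000_000
            spi = full_read_seconds(size, clock, lanes=1)
            qspi = full_read_seconds(size, clock, lanes=4)
            print(f"{clock_mhz}MHz {size_mib}MiB {spi:.7f} {qspi:.7f}")
```

Real transfers will be somewhat slower than this, since every read also needs a command and an address, but for a whole-device scan that overhead is small.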

Therefore if we think about how much reading we can do in roughly one second of wall clock time (okay for "sometimes, not always" operations, like at boot up for initialization):

And if we think about how much reading we can do in roughly sixteen seconds of wall clock time (okay for very rare operations, like in the case where a serious recovery is necessary):

Hmm, this is actually a little less feasible than I thought before. In many cases, the storage flash device is on a secondary interface: many devices that have QSPI use it for primary code flash, leaving only single SPI for "bulk storage". These secondary interfaces are also typically limited in speed.

I think a reasonable lower target would be single SPI at 8MHz. This means that we can expect roughly 1MiB of read in 1s, and roughly 16MiB of read in 16s. Hmm.
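To sanity check those figures, here is a rough sketch of the same arithmetic turned around: given a time budget, how many bytes fit? It assumes one bit per clock per lane and no protocol overhead, and `readable_bytes` is a made-up helper name.

```python
MIB = 1024 * 1024

def readable_bytes(budget_s, clock_hz, lanes=1):
    """Bytes that fit in `budget_s` at one bit per clock per lane."""
    return int(budget_s * clock_hz * lanes) // 8

if __name__ == "__main__":
    for budget_s in (1, 16):
        n = readable_bytes(budget_s, 8_000_000)
        print(f"{budget_s}s -> {n} bytes ({n / MIB:.2f} MiB)")
```

At 8MHz single SPI this works out to 1,000,000 bytes per second, which is slightly under 1MiB, so the 1s and 16s budgets are approximations rather than hard guarantees.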

Sharing flash for code (XIP) and bulk storage (FS)

Many developers want to use a single flash for both code and bulk storage. This seems desirable for a couple of reasons:

But there's one pretty huge downside of using the same flash device for code and bulk storage: you usually can't read from flash while you are erasing or writing, which means in most cases, your CPU will stall and block until the erase is complete. This may take >100ms in some cases!

If you are trying to write logs to the filesystem while you are doing something else like spinning a motor, this can have negative side effects!

There are SOME workarounds and tradeoffs possible to mitigate this:

So in most cases, I'd suggest developers use separate flashes for code and bulk storage. This can be either:

Otherwise, your use of the filesystem will be limited to cases where you write extremely rarely, and/or you are okay with the fact that the entire system may stall whenever you need to write or erase.

What do we need for a filesystem?

There are a couple of pieces that I think are necessary for a filesystem:

Blocks

We usually want to divide the flash into sectors/blocks, as these are the smallest unit that we can erase at one time. Sectors/blocks are most commonly 4KiB-64KiB.

Size     Block size   # of blocks
1MiB     4KiB         256
1MiB     16KiB        64
1MiB     64KiB        16
8MiB     4KiB         2048
8MiB     16KiB        512
8MiB     64KiB        128
16MiB    4KiB         4096
16MiB    16KiB        1024
16MiB    64KiB        256
32MiB    4KiB         8192
32MiB    16KiB        2048
32MiB    64KiB        512
128MiB   4KiB         32768
128MiB   16KiB        8192
128MiB   64KiB        2048
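The block counts are just the flash size divided by the erase block size. A tiny sketch, assuming every block on the device is the same size (the `block_count` helper is hypothetical):

```python
KIB = 1024
MIB = 1024 * KIB

def block_count(flash_bytes, block_bytes):
    """Number of erase blocks, assuming a uniform block size."""
    if flash_bytes % block_bytes != 0:
        raise ValueError("flash size must be a whole number of blocks")
    return flash_bytes // block_bytes
```

For example, `block_count(128 * MIB, 4 * KIB)` gives the 32768 in the table.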

The total number of blocks is important to keep in mind, because we will need to keep track of the state of each block, remembering which blocks are occupied, which ones are unused but need erase before writing, and which ones are erased and ready to be written to.

We'll need to be able to store or determine these states relatively quickly, because we'll want some kind of "block allocator": when we want to create a new file or write new data, we need to determine where to put it, and whether we need to do an erase before writing.
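As a very rough sketch of what such a block allocator might look like, here is a first-fit version in Python over the three states described above. The names `BlockState` and `BlockAllocator` are invented for this example, the erase is only counted rather than performed, and wear leveling is ignored entirely.

```python
from enum import Enum

class BlockState(Enum):
    OCCUPIED = "occupied"        # holds live data
    NEEDS_ERASE = "needs_erase"  # unused, but must be erased before writing
    ERASED = "erased"            # unused and ready to be written

class BlockAllocator:
    def __init__(self, states):
        self.states = list(states)
        self.erase_count = 0  # stand-in for real flash erase operations

    def allocate(self):
        # Prefer a block that is already erased, so no erase delay is paid.
        for wanted in (BlockState.ERASED, BlockState.NEEDS_ERASE):
            for index, state in enumerate(self.states):
                if state is wanted:
                    if wanted is BlockState.NEEDS_ERASE:
                        self.erase_count += 1
                    self.states[index] = BlockState.OCCUPIED
                    return index
        raise RuntimeError("no free blocks")

    def release(self, index):
        # Freed blocks still hold stale data until they are erased.
        self.states[index] = BlockState.NEEDS_ERASE
```

The linear scan is fine for a few hundred blocks, but with tens of thousands of blocks a smarter index (or at least a "next free" hint) would probably be needed.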

If we have 32768 blocks, and need to store just two bits of information for each block (is used and is erased), that would require 8KiB of storage to hold that information.
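To make that 8KiB figure concrete, here is a minimal sketch of a packed map with two bits per block, four blocks per byte. The `BlockMap` class and the choice of which bit means "used" and which means "erased" are assumptions for illustration only.

```python
class BlockMap:
    """Two bits per block: bit 0 = used, bit 1 = erased."""
    USED = 0b01
    ERASED = 0b10

    def __init__(self, block_count):
        # Two bits per block, rounded up to whole bytes.
        self.bits = bytearray((block_count * 2 + 7) // 8)

    def get(self, block):
        byte, slot = divmod(block, 4)
        return (self.bits[byte] >> (slot * 2)) & 0b11

    def set(self, block, flags):
        byte, slot = divmod(block, 4)
        shift = slot * 2
        cleared = self.bits[byte] & ~(0b11 << shift) & 0xFF
        self.bits[byte] = cleared | ((flags & 0b11) << shift)
```

`BlockMap(32768)` allocates exactly 8192 bytes. That is a noticeable chunk of RAM on a small microcontroller, which is part of why the total block count matters.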

Metadata

We need to store some kind of metadata for each file we store (more on files later). On desktop filesystems, this might include things like permissions and timestamps of creation, modification, or access. It also includes things like the name of the file, and potentially the length of the file.
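As an illustration of how small such a record could be, here is a sketch of a fixed-size metadata entry in Python. The layout (a 32 byte name, a 32-bit length, and a 32-bit modification timestamp, all little-endian) is purely an assumption for this example, as are the names `FileMeta` and `META_FORMAT`.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 32-byte NUL-padded name, u32 length, u32 modified time.
META_FORMAT = "<32sII"
META_SIZE = struct.calcsize(META_FORMAT)  # 40 bytes

@dataclass
class FileMeta:
    name: str
    length: int    # file length in bytes
    modified: int  # seconds since some epoch, 0 if unknown

    def pack(self):
        raw_name = self.name.encode("utf-8")
        if len(raw_name) > 32:
            raise ValueError("name too long for this layout")
        # struct pads the name with NUL bytes up to 32 bytes.
        return struct.pack(META_FORMAT, raw_name, self.length, self.modified)

    @classmethod
    def unpack(cls, raw):
        raw_name, length, modified = struct.unpack(META_FORMAT, raw)
        return cls(raw_name.rstrip(b"\x00").decode("utf-8"), length, modified)
```

A fixed-size record like this keeps scanning simple, at the cost of a hard limit on name length and no room for things like permissions.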

We also may want to use metadata to describe "scattered" files, if the storage required for a file spans multiple blocks. We also may want to use metadata to allow for "overlays", if we want to replace or insert data into a file, without fully copying the file.
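One way to picture both ideas is a file described by a list of "extents" (pieces that live in different blocks), plus a list of "overlays" applied on top when reading. The sketch below is only a toy model: `Extent`, `Overlay`, and `read_file` are hypothetical names, the flash is just a byte buffer, and overlays only handle replacing bytes in place, not inserting them.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    block: int   # which flash block holds this piece
    offset: int  # byte offset inside that block
    length: int  # number of bytes in this piece

@dataclass
class Overlay:
    file_offset: int  # where in the file the replacement starts
    data: bytes       # replacement bytes

def read_file(flash, block_size, extents, overlays):
    # Stitch the scattered pieces together in file order...
    out = bytearray()
    for ext in extents:
        start = ext.block * block_size + ext.offset
        out += flash[start:start + ext.length]
    # ...then apply replacements on top, with later overlays winning.
    for ov in overlays:
        out[ov.file_offset:ov.file_offset + len(ov.data)] = ov.data
    return bytes(out)
```

The appeal of overlays is that a small edit only costs a small write; the downside is that every read has to walk the overlay list, so at some point the file would need to be compacted.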

Files

Most people want a filesystem so they can store files. Files are of arbitrary size, and have a couple of common access patterns:

Wrapping up

Writing this all down has helped solidify some of the thoughts I had about writing a FS, and has also made me realize that some of the ideas I had are less viable or reasonable than I thought.

Maybe I'll pick up writing on this more later :)