A table anonymiser

What I built (GitHub) is a first step, not the destination. An anonymiser can strip identifiers from a billing file such as an AWS Cost and Usage Report (CUR), and mine is designed to work across table formats rather than AWS alone. But the structure of the data still carries meaning. Even without names, a detailed billing file reveals patterns of investment, priorities, and architecture choices. Over time, those patterns can be read almost like a balance sheet of a cloud estate, exposing how a company operates and where it is placing its bets.
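To make the point concrete, here is a minimal sketch. It is not the code from my repository: the column names (`account_id`, `resource_id`, `service`, `cost`), the key, and both helper functions are assumptions made up for illustration. It replaces identifier values with keyed hashes and then checks what is left.

```python
import hashlib
import hmac

# Hypothetical column names, loosely modelled on a billing export.
IDENTIFIER_COLUMNS = {"account_id", "resource_id"}


def pseudonymise(rows, secret_key, identifier_columns=IDENTIFIER_COLUMNS):
    """Replace identifier values with keyed hashes; leave every other column alone."""
    out = []
    for row in rows:
        clean = dict(row)  # copy, so the input rows are not modified
        for col in identifier_columns:
            if col in clean:
                digest = hmac.new(secret_key, str(clean[col]).encode(), hashlib.sha256)
                clean[col] = digest.hexdigest()[:16]
        out.append(clean)
    return out


def spend_share_by_service(rows):
    """Fraction of total cost per service: the 'shape' of the file."""
    totals = {}
    for row in rows:
        totals[row["service"]] = totals.get(row["service"], 0.0) + row["cost"]
    grand_total = sum(totals.values())
    return {service: cost / grand_total for service, cost in totals.items()}


# Invented sample rows, for illustration only.
rows = [
    {"account_id": "111122223333", "resource_id": "i-0abc", "service": "compute", "cost": 600.0},
    {"account_id": "111122223333", "resource_id": "db-prod", "service": "database", "cost": 300.0},
    {"account_id": "444455556666", "resource_id": "bucket-logs", "service": "storage", "cost": 100.0},
]

anonymised = pseudonymise(rows, secret_key=b"example-key")

# The identifiers are gone, but the spend profile is exactly the same.
print(spend_share_by_service(anonymised) == spend_share_by_service(rows))
```

The hashes hide who owns each row, yet the split of spend across services is untouched, and that split is the part that describes the business.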

The limitation is not technical but conceptual. We tend to think anonymisation is about hiding fields, when the real issue is that the shape of the data itself is informative. Even a carefully processed file can still point back to its owner, in spirit if not in name. That is why the real opportunity sits elsewhere.

The opportunity emerges when many billing files are brought together in a controlled, secure repository. At that point, the focus shifts from individual records to statistical transformation: averages, distributions, trends. The data stops describing a company and starts describing a system. You no longer see who did what; you see how an industry behaves.
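A rough sketch of that pooling step might look like the following. Everything in it is an assumption for illustration: the input layout (one spend profile per contributor), the function name, and the threshold of five contributors below which a figure is withheld. A minimum count on its own is not a privacy guarantee; it only shows the direction of travel.

```python
from statistics import mean, median

MIN_PARTICIPANTS = 5  # assumed threshold; the right value is a policy decision


def pooled_statistics(share_by_company, min_participants=MIN_PARTICIPANTS):
    """Turn per-company spend shares into per-service summary statistics.

    share_by_company maps a contributor label to {service: share_of_spend}.
    Services reported by fewer than min_participants contributors are dropped.
    """
    by_service = {}
    for shares in share_by_company.values():
        for service, share in shares.items():
            by_service.setdefault(service, []).append(share)

    result = {}
    for service, values in by_service.items():
        if len(values) < min_participants:
            continue  # too few contributors: publish nothing for this service
        result[service] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
        }
    return result


# Invented profiles, for illustration only.
profiles = {
    "a": {"compute": 0.5, "ml": 0.2},
    "b": {"compute": 0.6, "ml": 0.1},
    "c": {"compute": 0.4},
    "d": {"compute": 0.7},
    "e": {"compute": 0.3},
}

summary = pooled_statistics(profiles)
print(sorted(summary))  # only services with enough contributors remain
```

The output carries no contributor labels at all, and the `ml` figures, which only two contributors reported, never leave the repository.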

That shift matters because it changes what can be shared. Instead of exposing strategy, organisations contribute to a collective signal that is both anonymous and meaningful. The goal is not better anonymisation; it is the creation of datasets that are structurally incapable of revealing any single participant, while still rich enough to inform decisions.