
What metadata can an encrypted workspace tool still see?

End-to-end encryption protects content. Metadata — who messages whom, when, how much, from where — has its own privacy properties. Honest framing of what stays visible even on E2E platforms.

When a workspace tool is end-to-end encrypted, the vendor cannot read your message content, documents, or files. That's the headline. The follow-up question — and the one that matters for sophisticated security buyers — is what the vendor can still see.

The answer is metadata. Unlike content, metadata is the structural information about a communication: who messages whom, when, how often, how much, and from where. Even a well-designed E2E system has metadata; the questions are how much it holds and what is done with it.

What metadata exists in any E2E workspace tool

Every workspace tool, encrypted or not, holds some operational metadata to function. The unavoidable parts:

Account identifiers. Each user has a stable internal ID. Most tools also have an email or phone number associated with the account for authentication, invitations, and external delivery. The mapping {email → account_id → account_creation_date} is essentially the cost of an account-based system.

Connection metadata. When a client connects to the server to send or fetch messages, the server sees the IP address, approximate timing, and connection size. This is true of every TLS-protected service; it's a property of the network layer, not the application.

Message-routing metadata. A message is routed from sender to recipients. The server knows that a message of length N was sent from account A to a group containing accounts B, C, D at time T — even if the content is ciphertext. This is the minimum the server needs to know to deliver the message.

Storage metadata. Files take up disk space. The server knows how many bytes each account has stored. Workspace tools that meter storage need this; ones that don't may not.
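The four categories above can be sketched as the records a server ends up holding. This is a minimal illustration, not any vendor's schema: every class, field, and function name below is hypothetical, chosen only to mirror the descriptions in the text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AccountRecord:
    account_id: str        # stable internal ID
    email: str             # used for authentication and invitations
    created_at: datetime


@dataclass(frozen=True)
class ConnectionEvent:
    account_id: str
    ip_address: str        # visible at the network layer
    connected_at: datetime
    bytes_transferred: int


@dataclass(frozen=True)
class RoutingEnvelope:
    sender_id: str
    recipient_ids: Tuple[str, ...]
    sent_at: datetime
    ciphertext: bytes      # opaque to the server


def server_view(envelope: RoutingEnvelope) -> dict:
    """Return what a server learns from an envelope without holding any key."""
    return {
        "sender": envelope.sender_id,
        "recipients": list(envelope.recipient_ids),
        "sent_at": envelope.sent_at.isoformat(),
        # The server cannot read the ciphertext, but it can measure it.
        "length": len(envelope.ciphertext),
    }


def storage_used(blobs: Dict[str, List[bytes]]) -> Dict[str, int]:
    """Bytes stored per account, computed from ciphertext sizes alone."""
    return {account: sum(len(b) for b in items) for account, items in blobs.items()}


envelope = RoutingEnvelope(
    sender_id="A",
    recipient_ids=("B", "C", "D"),
    sent_at=datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
    ciphertext=b"\x00" * 128,  # stand-in for real ciphertext
)
view = server_view(envelope)
```

Note that `server_view` never touches a key: sender, recipients, time, and length are all available from the envelope itself, which is the point of the routing-metadata paragraph.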

What varies between platforms

Above the unavoidable floor, different E2E platforms collect different metadata footprints. The reasoning is usually a feature trade-off — every piece of metadata is the cost of some product feature that needs it.

Contact graph (who messages whom). Signal famously goes to engineering lengths to minimize this — sealed-sender means even Signal's servers don't reliably learn the sender of a given message. WhatsApp, iMessage, and most workspace tools see the full sender-recipient pair on every message.

Group membership. Some tools (Signal, Wire) implement private groups where the server doesn't know who's in a given group. Most workspace tools, including Koaich at present, keep group membership server-visible so that access control can be enforced with row-level-security (RLS) style mechanisms. This is a metadata-versus-operations trade-off: cryptographic private groups exist (Signal-style auth credentials) but are operationally more complex.

Display names. When you message someone, do you see their custom display name? If yes, the server typically holds that display name in cleartext. Koaich currently holds vault display names in cleartext while encryption for them is in flight; contact display names are already encrypted client-side and stored as ciphertext.

Read receipts and typing indicators. If a tool shows you "Alice is typing" or "Bob read your message," that signal is typically relayed through the server, and in many tools it is generated server-side from events the server observed. End-to-end metadata privacy and read receipts are partially in tension; some tools opt out of read receipts to preserve metadata privacy.

What metadata can be inferred even from encrypted content

Even if the server only holds ciphertext, it can infer some things from the structure of the encrypted traffic:

Activity patterns. A user who sends most of their messages between 9 AM and 5 PM Pacific Time on weekdays probably works a day job somewhere in that time zone, such as California. A user who suddenly stops sending messages might be on vacation or might have been arrested. The server observes timing without needing content.

Volume signals. A 10 MB encrypted file is plausibly a presentation or video. A 100-byte encrypted message is plausibly a short reply. The ciphertext size is observable either way.

Relationship strength. If A and B exchange 50 messages a day for 3 years, then 0 messages for 6 months, the server can model the change without reading any of the messages.
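The inferences above need nothing more than the routing log. As a rough sketch, assuming a hypothetical log of (sender, recipient, timestamp, ciphertext length) tuples, two small aggregations are enough to recover activity hours and the shape of a relationship over time. The event format and function names are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Tuple

# Hypothetical log entry: (sender, recipient, sent_at, ciphertext_length)
Event = Tuple[str, str, datetime, int]


def active_hours(events: Iterable[Event], account: str) -> Counter:
    """Histogram of UTC hours in which `account` sent messages."""
    return Counter(
        sent_at.hour for sender, _, sent_at, _ in events if sender == account
    )


def monthly_pair_counts(events: Iterable[Event], a: str, b: str) -> Counter:
    """Messages exchanged between a and b per (year, month), in either direction."""
    pair = {a, b}
    return Counter(
        (sent_at.year, sent_at.month)
        for sender, recipient, sent_at, _ in events
        if {sender, recipient} == pair
    )


utc = timezone.utc
events = [
    ("A", "B", datetime(2024, 1, 3, 17, 5, tzinfo=utc), 120),
    ("B", "A", datetime(2024, 1, 3, 17, 9, tzinfo=utc), 96),
    ("A", "B", datetime(2024, 2, 10, 18, 30, tzinfo=utc), 10_000_000),
    ("A", "C", datetime(2024, 2, 11, 2, 0, tzinfo=utc), 80),
]

hours = active_hours(events, "A")
per_month = monthly_pair_counts(events, "A", "B")
```

Neither function ever looks at message content. A month that is missing from `per_month` after years of steady counts is exactly the kind of change the server can model.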

Some of these can be mitigated cryptographically. Padding messages to fixed sizes hides message lengths; constant-rate sending and cover traffic hide timing. Most workspace tools don't do either because they're expensive in bandwidth and latency, and the threat model usually doesn't justify them.
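To make the size mitigation concrete, here is a minimal sketch of bucket padding applied to plaintext before encryption, so that the resulting ciphertext length reveals only a size class. The bucket sizes and the 4-byte length prefix are assumptions for illustration, not a description of any particular product, and the sketch assumes the cipher adds a constant overhead per message.

```python
import struct

BUCKETS = (256, 1024, 4096, 16384, 65536)  # hypothetical size classes, in bytes
HEADER = struct.Struct(">I")               # 4-byte big-endian length prefix


def pad_to_bucket(plaintext: bytes) -> bytes:
    """Prefix the true length, then zero-pad up to the smallest bucket that fits."""
    needed = HEADER.size + len(plaintext)
    for bucket in BUCKETS:
        if needed <= bucket:
            padding = b"\x00" * (bucket - needed)
            return HEADER.pack(len(plaintext)) + plaintext + padding
    raise ValueError("message exceeds the largest bucket; split it first")


def unpad(padded: bytes) -> bytes:
    """Recover the original plaintext from a padded message."""
    (length,) = HEADER.unpack_from(padded)
    if length > len(padded) - HEADER.size:
        raise ValueError("corrupt length prefix")
    return padded[HEADER.size:HEADER.size + length]


short_reply = pad_to_bucket(b"ok")
longer_reply = pad_to_bucket(b"see you at the meeting at three")
```

Both replies pad to the same 256-byte bucket, so an observer of the ciphertext cannot tell them apart by length. The cost is visible too: a 2-byte message now occupies 256 bytes, which is why the text calls these mitigations expensive.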

How honest vendors frame this

A vendor that says "zero-knowledge" without qualification is overclaiming for almost any service that has accounts. The honest position is: we hold ciphertext for content, plus the minimum operational metadata we need to deliver the service. If you can articulate which metadata you hold and why, you're being honest. If you wave at "zero-knowledge" without specifics, you probably haven't done the audit.

What Koaich holds, intentionally, is documented in /security §6.5: message timestamps, sender-recipient pairings, vault membership, account email at signup, and storage usage. Content is encrypted. Encryption for vault display names is in flight (it ships in the current backend workstream). Contact display names are already encrypted.

The reasonable test for any vendor: if a journalist published their privacy policy side by side with their architecture diagram, would the architecture match the marketing? For most workspace tools today, the answer is no. The few where the answer is yes (Signal, Proton Mail, Koaich) make their position explicit because the position is the value.

Frequently asked questions

Can an end-to-end encrypted service see who I'm talking to?

Usually yes. Most E2E platforms see the sender-recipient pair on every message, because that is what conventional message routing relies on. Signal uses sealed-sender to hide the sender; most workspace tools don't because the trade-off doesn't justify the operational complexity at workspace scale.

What metadata does Signal collect?

Signal collects the account creation date and the last connection date (rounded to day-level granularity). The phone number is associated with the account but stored in a form that's resistant to bulk lookups. Sealed-sender prevents Signal's servers from reliably learning who sent a given message. The metadata footprint is among the smallest of any production E2E messenger.
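Day-level granularity is a simple form of data minimization: the precise time is discarded before anything is stored. A minimal sketch of the idea follows; truncating to the UTC day is an assumption made for illustration, not a description of Signal's code.

```python
from datetime import datetime, timezone


def coarsen_to_day(ts: datetime) -> datetime:
    """Truncate a timezone-aware timestamp to midnight UTC of the same day."""
    ts = ts.astimezone(timezone.utc)
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


last_seen = datetime(2024, 3, 5, 23, 59, 41, tzinfo=timezone.utc)
stored = coarsen_to_day(last_seen)
```

Once only `stored` is kept, the hour-of-day activity pattern cannot be reconstructed from that field, whoever later gains access to it.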

What metadata does Koaich collect?

Account identifiers (email at signup), message timestamps and sender-recipient pairings, vault membership, storage volumes, IP addresses on connection. Content (messages, documents, files, AI prompts) is encrypted under user-held keys. The full inventory with rationale per item is at /security §6.5.

Can an attacker infer my activity from encrypted metadata?

Some inference is possible from timing patterns and message volumes. Fixed-size padding and constant-rate cover traffic mitigate this; most workspace tools don't deploy them because the operational cost is high and the threat model usually doesn't justify them. For users where this matters (journalists protecting sources, dissidents in surveillance contexts), Signal's metadata-minimization model is the higher bar.
