Secure client data automation for consultants is a custodian count problem, not an encryption problem.
Direct answer (verified 2026-04-29)
Move the automation off cloud platforms and onto your own machine. Tools like Zapier, Make, hosted n8n, and HoneyBook ingest your client data into their servers, which makes them a second permanent custodian on top of Gmail and your CRM. A local-first agent (the category Clone sits in) drives your existing apps from the OS layer using the OAuth tokens already on your laptop, and writes its memory and audit log to ~/.clone/ on disk. Client data does not get a second custodian. With an optional local LLM, model inference runs on-device too. Verified against the architecture and design principles documented in the product’s public llms.txt.
Almost every guide on this topic answers the same way: pick a vendor with a SOC 2 Type II report, make sure transit is TLS 1.2 or better, set up role-based access control on the workspace, and you are done. That answer is the floor, not the ceiling. It assumes a fixed architecture, a cloud automation platform sitting between your inbox and your CRM, and then asks how to harden that platform.
The architectural question is the one that goes unasked: should there be a platform sitting between your inbox and your CRM at all? For a solo consultant or a boutique advisory firm, the honest answer is usually no. Adding a cloud automation tool means you have moved from one custodian of your client data (your existing app stack) to two (your apps, plus the automation vendor). Each new custodian is its own retention window, its own breach surface, its own subprocessor list, and its own row on the data-processing addendum your client will eventually ask you to sign.
The fix is structural. Run the automation on your own machine, drive your existing apps from the OS layer, write the memory and the audit log to disk under your home directory. The encryption and access-control questions still matter, but they collapse to the questions you already answer for your laptop, not new questions you have to answer for a vendor. This page walks the data path under each architecture, shows the audit-log file Clone writes on disk, and names the three knobs that tighten the boundary further when an engagement requires it.
“A cloud automation tool turns one custodian of your client data into two. A local-first agent keeps it at one. The encryption questions are the same; the custodian count is what changes.”
Calibrated against the architecture documented in the product’s public llms.txt and observed against the subprocessor lists of Zapier, Make, HoneyBook, and hosted n8n.
The data-egress map
Where does the message body actually go?
Trace a single client email from arrival to logged-into-CRM. The shape of the path is the security model. On the left: the apps that already hold the data. In the middle: the automation engine. On the right: where Clone writes its memory and audit log on your machine.
Sources → Clone (local) → Local artifacts
Notice what is not on the diagram. There is no “Clone Cloud” box. There is no third-party database holding a copy of the email thread, the transcript, or the CRM record. The planner reads from the apps you already gave OAuth to, decides what to do, and writes back to those same apps. The two persistent artifacts (memory, log) live on disk under your home directory. The only egress is the writes Clone performs on your behalf, to the apps you would have written to yourself.
Cloud architecture
The same workflow, on a cloud automation platform.
Same client email, same destination CRM, different middle. The cloud automation tool persists each step. The message body lands in their database. The enriched note lands in their database. The webhook acknowledgement lands in their database. Every persisted record is a row that has to be retained, indexed, audited, and eventually disclosed in a subprocessor or breach context.
Cloud automation: each step persists in the vendor store
By the time the workflow finishes, the message body and the client identity have been copied at least once into the vendor’s database. The transit is encrypted; that is the easy part. The custodian list now reads: Gmail, the cloud tool, HubSpot. Three custodians for one engagement. The same workflow run on your machine looks like this instead.
Local-first: persistence is on your disk
The shape is similar at a glance, but the persisted-rows column is gone. Clone’s “append action entry to disk” writes to your local file system. The CRM still receives the enriched note over OAuth, exactly as it would in the cloud version. The custodian list is back to two: Gmail, HubSpot. Your laptop holds the operational memory and the audit trail, and your laptop is already on your security policy.
The audit log on disk
You can grep the audit trail with the same tools you grep your code.
Every action Clone takes appends one entry to ~/.clone/log/<date>.json. The entry carries a timestamp, the handler name, the trigger event, the target app, the action verb, the activity id stamped by the target service, the fields touched, and a rollback pointer. There is no remote dashboard. The dashboard is a JSON file. The query language is jq.
When a client asks what happened on their account at 2:02pm Tuesday, you grep the log for the timestamp and the answer is in front of you. When a client asks for the trail to be deleted at the end of an engagement, you rm the relevant date range. When a handler writes the wrong thing, the rollback pointer tells you exactly which action to reverse. The whole audit and reversibility surface is files in your home directory, governed by your own access policy.
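To make that lookup concrete, here is a sketch against a sample entry. The field names (ts, handler, activity_id, rollback) are assumptions modeled on the fields described above, not Clone’s documented schema, and the demo writes to /tmp so it is safe to run as-is.

```shell
# Illustrative only: one sample entry in the shape described above,
# then the "what happened at 2:02pm Tuesday?" query with jq.
# Field names are assumptions, not Clone's documented schema.
mkdir -p /tmp/clone-demo/log
cat > /tmp/clone-demo/log/2026-04-28.json <<'EOF'
{"ts":"2026-04-28T14:02:11Z","handler":"post-call","trigger":"meeting.ended","app":"hubspot","action":"create_engagement","activity_id":"eng_84213","fields":["notes","next_step"],"rollback":{"verb":"delete_engagement","id":"eng_84213"}}
EOF

# Select entries in the 14:02 minute; print handler, action, activity id.
jq -r 'select(.ts | startswith("2026-04-28T14:02"))
       | "\(.ts) \(.handler) \(.action) \(.activity_id)"' \
   /tmp/clone-demo/log/2026-04-28.json
```

On a real install you would point the same query at ~/.clone/log/&lt;date&gt;.json instead of the scratch directory.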
Three knobs
The boundaries you actually choose.
Local-first is the default; everything below is a tightening knob you can apply when an engagement requires it. Each one is a configuration change in ~/.clone/memory/. None of them require a vendor escalation, a procurement cycle, or a new contract.
Knob 1
Local LLM for planning
Point Clone’s planner at a local model (llama.cpp, Ollama, or a private cloud endpoint inside your tenancy). The prompt content used to plan actions stops leaving your machine. Useful when an engagement’s data-handling clause forbids third-party model providers from seeing client content.
Knob 2
Per-client exclusions
Tag a client as sensitive in policy.md and every handler skips threads, transcripts, and records associated with that client. The handlers still run for the rest of your book. Common for healthcare, legal, and any engagement under a heightened NDA.
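The page does not show policy.md’s exact format, so treat this as a hypothetical sketch of what the tag could look like; the section name and layout are assumptions, only the file’s location and purpose come from the text above.

```
# ~/.clone/memory/policy.md — hypothetical layout, format assumed

## sensitive-clients
- Acme Health LLC       <- handlers skip all threads, transcripts, records
- Meridian Legal Group  <- heightened-NDA engagement
```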
Knob 3
Log retention window
Default is per-day JSON files kept until you delete them. For engagements that require a tighter retention window (24 hours, 7 days), rotate ~/.clone/log/ on cron. The rollback surface tightens with the window; the operational memory in memory/ is independent and unaffected.
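A minimal sketch of that rotation, run here against a scratch directory so it is safe to try. The cron line in the comment is the version you would actually install; the `-mtime +7` predicate is standard `find`, not a Clone feature.

```shell
# 7-day retention sweep, demoed on a scratch copy of the log dir.
# In a real crontab you would point it at ~/.clone/log/, e.g.:
#   0 3 * * *  find "$HOME/.clone/log" -name '*.json' -mtime +7 -delete
LOGDIR=$(mktemp -d)
touch "$LOGDIR/2026-04-28.json"                    # recent: kept
touch -d '10 days ago' "$LOGDIR/2026-04-18.json"   # stale: deleted
find "$LOGDIR" -name '*.json' -mtime +7 -delete
ls "$LOGDIR"
```

The same one-liner with `-mtime +0` gives the 24-hour window; the operational memory in memory/ is untouched either way.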
The honest version of the trade
What you give up by going local.
The local-first architecture is not a free lunch. Three trades are real and worth saying out loud. First, availability: the agent only runs while your laptop is awake, so scheduled handlers fire on your machine, not on a hosted scheduler. Second, scale: a team of more than a few consultants needs per-seat installs and a shared loop file in the firm’s drive (the boutique tier is built around this shape). Third, custody: the audit trail is local, so losing an unencrypted laptop loses the trail; full-disk encryption and a backup posture are part of the contract.
The trade most consultants will accept: keep one custodian instead of two, accept that the agent runs when your machine is awake, and own the audit trail yourself. It is the same trade you make every day for your inbox, your file system, and your password manager. Adding an automation layer should not change the threat model; it should inherit it.
Want to see the local audit log on your machine?
Twenty minutes on a screen share: we walk through what gets written to ~/.clone/ when a real Zoom call ends, and answer the questions your client’s DPA is going to ask.
Frequently asked questions
What does “secure client data automation” actually mean for a solo consultant?
It means three things, in order. First, the data path: every byte of client information your automation handles either stays inside the tool that already holds it (Gmail, your CRM, QuickBooks) or it gets copied somewhere else. Second, the custodians: how many separate parties end up holding a copy of the same client conversation, and which ones you signed an NDA with. Third, the audit trail: when something goes wrong six months from now, can you tell exactly what was written, where, and when. Most security pages on this topic only cover the first item (encryption in transit) and skip the other two. The custodian count is where consultants actually take risk.
Why is a cloud automation tool a second custodian if I already use Gmail and HubSpot?
Because the cloud tool is a separate company, with a separate database, a separate retention policy, and a separate breach surface. When you connect Zapier to Gmail, Zapier reads your inbox via OAuth and stores the message body, attachments, and metadata in their own database (their own SOC 2 auditor, their own data residency, their own subprocessor list). The same thing happens with Make, with hosted n8n, and with HoneyBook when you import a contact. Your tools (Gmail, HubSpot) are one custodian. The automation platform is a second one. If the second one is breached, your client data is in the breach; the fact that Gmail was not breached does not help.
Is a local-first agent really different, or is it the same data going through different pipes?
It is different. Clone reads your Gmail thread or your Zoom transcript on your machine, decides what to do (draft a follow-up, log a note in HubSpot, send an invoice), and executes those actions using the same OAuth tokens you would use yourself. The thread body, the transcript, the action plan, and the audit log all live on disk under your home directory. The only network traffic is to the apps you would have hit anyway (Gmail, HubSpot, QuickBooks). Clone does not store a copy of the email thread on a remote server. There is no Clone-side database that holds your client data. That is what the local-first claim means in concrete terms.
Where exactly does Clone keep my data on disk, and what is in those files?
Two directories under your home folder. ~/.clone/memory/ holds plain-text handler files: post-call.md, invoice.md, dunning.md, onboarding.md, weekly-report.md, monthly-bookkeeping.md, quarterly-pipeline.md. Each one declares its trigger event, the target apps, and the action steps. ~/.clone/log/&lt;date&gt;.json is the action log: one JSON entry per action Clone took, with the timestamp, handler name, target app, action, the activity id stamped by the target service, and a rollback pointer. You can open any of these files in TextEdit. You can grep them with the same tools you already use. You can delete them and start over.
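Put together, the on-disk layout looks like this (dates illustrative; policy.md is the per-client exclusions file):

```
~/.clone/
├── memory/
│   ├── post-call.md
│   ├── invoice.md
│   ├── dunning.md
│   ├── onboarding.md
│   ├── weekly-report.md
│   ├── monthly-bookkeeping.md
│   ├── quarterly-pipeline.md
│   └── policy.md
└── log/
    ├── 2026-04-28.json
    └── 2026-04-29.json
```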
What about the LLM that Clone uses to plan actions? Does that send my data to a model provider?
By default Clone uses a hosted LLM for planning, and the planning prompt does include the relevant context (the email body Clone is responding to, the call transcript snippet, the CRM record being updated). For consultants whose engagements forbid that, Clone supports connecting a local or private LLM, so model inference runs on your machine or inside your private cloud. This is a design principle of the architecture, not a marketing claim: “your data stays local” is a configurable boundary. If you have a llama.cpp or Ollama instance running on your laptop, point Clone at it and the model layer is on-device too.
I already have a SOC 2 / ISO 27001 questionnaire from a client. How does this answer it?
Most of the questionnaire fields collapse from “we rely on subprocessor X with their own SOC 2 report” to “the data does not leave the consultant’s machine.” Subprocessor list: empty (or just the apps the consultant already discloses). Data residency: wherever the consultant’s laptop is. Data retention: whatever the consultant configures on disk. Breach notification: governed by the consultant’s own policy, not a vendor’s. Two questionnaire fields remain: the planning-LLM provider (configurable, can be set to local) and the OS-level access Clone needs (accessibility APIs and browser control on the consultant’s own machine, audited via the on-disk log). For most boutique-firm engagements those two answers are easier to defend than the long subprocessor lists a Zapier or HoneyBook deployment generates.
How does the audit log help when a client asks what happened on their account?
Each action written to ~/.clone/log/<date>.json has the activity id stamped by the target service (the HubSpot engagement id, the Gmail draft id, the Stripe invoice id). When a client asks “who updated my pipeline note at 2:14pm yesterday?” you grep the log for the timestamp and you have: which handler ran, which trigger fired it, what fields were touched, what record id the change carries. That is more granularity than most cloud automation platforms surface in their UI, and it lives on your laptop, not behind their support portal.
What if my laptop is lost or stolen? Is the data on disk a bigger problem than data in the cloud?
Use full-disk encryption (FileVault on Mac, BitLocker on Windows). With FileVault on, a stolen laptop is a brick to anyone without your password, and ~/.clone/ inherits the same protection as the rest of your home folder. The threat model improvement is real: a stolen laptop is one device, recoverable through Find My, with a known custodian (you). A breach of a cloud automation vendor is many devices, often discovered months later, governed by their disclosure timeline. Disk encryption + a recent backup is a stronger posture than “trust the vendor” for most boutique consulting engagements.
Can I roll back an action if Clone wrote the wrong thing into a client’s record?
Yes. Every entry in the audit log carries a rollback pointer: for a Gmail draft, delete_draft with the draft id; for a HubSpot engagement, delete_engagement with the engagement id; for a Stripe invoice, void_invoice with the invoice id. One click in the review surface reverts a single action. You can also revert an entire morning’s worth of actions in one pass by selecting a time range. The architectural commitment is: every action Clone takes is logged and reversible. The reversibility comes from those rollback pointers, written at the moment the action is taken.
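A sketch of the time-range pass, again with assumed field names: select a window, print one rollback instruction per entry, ready to review before reverting. The sample entries below are invented to match the rollback verbs named above.

```shell
# Illustrative only: list the rollback pointer for every action in a
# morning window. Field names are assumptions, not Clone's schema.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
{"ts":"2026-04-28T09:12:40Z","app":"gmail","action":"create_draft","rollback":{"verb":"delete_draft","id":"r-7731"}}
{"ts":"2026-04-28T09:14:02Z","app":"stripe","action":"create_invoice","rollback":{"verb":"void_invoice","id":"in_9Q2"}}
{"ts":"2026-04-28T15:30:00Z","app":"hubspot","action":"create_engagement","rollback":{"verb":"delete_engagement","id":"eng_11"}}
EOF

# ISO-8601 timestamps sort lexicographically, so plain string
# comparison selects the 9am-noon window.
jq -r 'select(.ts >= "2026-04-28T09:00" and .ts < "2026-04-28T12:00")
       | "\(.rollback.verb) \(.rollback.id)"' "$LOG"
```

The 15:30 entry falls outside the window and is left alone; only the two morning actions are listed for reversal.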
What about engagements that explicitly forbid running third-party automation on client data, like a healthcare or legal client?
Three knobs cover most of those. First, configure the planning LLM to a local model so prompt content stays on your machine. Second, exclude specific clients with a tag in ~/.clone/memory/policy.md so handlers skip threads, calls, or records flagged sensitive. Third, run Clone with no log retention beyond the current day if your engagement requires it (the log is a per-day JSON file under your home directory; rotate it on cron). That is enough to satisfy most engagement-specific data-handling clauses without giving up automation on the rest of the book.
Other guides on how the loops actually run on your machine.
Keep reading
Consulting Back Office Automation
Seven loops in the back office, seven different close windows, seven plain-text handler files in ~/.clone/memory/.
Post Call Admin Automation
What fires inside the 60-second window after Zoom emits meeting.ended, and where the artifacts land on your machine.
Consulting Admin Open Loops
Each back-office chore is an open loop with a typed shape. The file format that lets each loop close itself.