07

Service Capability

AI agents and custom GPTs

Custom GPTs, Claude Projects, and autonomous AI agents for routine business tasks. Built with safety rails, audit logging, and human-in-the-loop on consequential actions — not the move-fast-and-break-things agent pattern that produced OpenClaw.

Last updated 12 May 2026

AI agents are the most over-hyped and under-deployed category in business AI as of 2026. The over-hype: every AI vendor selling "autonomous agents" that supposedly handle complex multi-step work without human oversight. The under-deployment: most businesses still don't have a single AI agent in production because the genuine deployments require careful design that most consultants and vendors skip.

What follows is what we actually build for Australian businesses around AI agents in 2026 — sized for both lightweight deployments (custom GPTs for specific team workflows, Claude Projects for specialist roles) and more substantial agent platforms (MCP-based agents that orchestrate multi-step tasks with structured tool access). The deployments use the platforms that work in production: OpenAI's Custom GPTs, Claude Projects, Microsoft Copilot Studio, and MCP-based agents on Claude or GPT-4-class models.

The architectural rule that separates working agents from disasters: every consequential action requires human approval, and every action is audit-logged. The OpenClaw security crisis of early 2026 happened because agents had unchecked tool access — that's not an agent design problem, it's a permission design problem, and we build for it from day one.

The Reality

Why AI agents are harder than vendors admit

1. Agent security boundaries are easy to get wrong

The OpenClaw incident (March 2026) was a stark reminder that AI agents with broad tool access are a security risk if the permission model isn't carefully designed. Agents that can read your email AND modify your files AND send external messages can be tricked or manipulated in ways that single-purpose tools can't. We build with strict tool permissions, audit logging on every action, and human approval gates on consequential operations.
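The permission model described above can be made concrete. This is an illustrative sketch, not our production code — the agent names, tool names, and permission table are hypothetical — but it shows the shape: every tool call is checked against the agent's permitted set, consequential actions pause for a human, and every attempt (allowed or not) lands in the audit log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical permission table: each agent may call only the tools listed here.
PERMISSIONS = {
    "email-triage-agent": {"read_inbox", "draft_reply"},   # no send, no delete
    "cs-agent": {"read_crm", "send_email"},
    "research-agent": {"web_search", "read_docs"},
}

# Actions that always pause for a human, even when the tool is permitted.
CONSEQUENTIAL = {"send_email", "modify_record", "trigger_payment"}

@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, agent: str, tool: str, allowed: bool, reason: str) -> None:
        # Every attempt is logged, including denials — that's the audit trail.
        self.entries.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "agent": agent, "tool": tool,
            "allowed": allowed, "reason": reason,
        })

def dispatch(agent: str, tool: str, log: AuditLog) -> str:
    """Check permissions before any tool call; log every attempt."""
    if tool not in PERMISSIONS.get(agent, set()):
        log.record(agent, tool, False, "not in agent's permitted toolset")
        return "denied"
    if tool in CONSEQUENTIAL:
        log.record(agent, tool, False, "queued for human approval")
        return "pending_approval"
    log.record(agent, tool, True, "permitted, non-consequential")
    return "executed"
```

The point of the pattern: an agent tricked into requesting `trigger_payment` gets a denial and a log entry, not a payment — the boundary holds even when the model misbehaves.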

2. Custom GPTs sound simple but production-grade is harder

Building a Custom GPT or Claude Project for a specific business task is straightforward in a demo. Making it work reliably with your actual data, in your actual voice, against the edge cases your business actually has, is much harder. Most internal-build attempts hit a wall at the production phase. Our deployments include the production hardening work most teams skip.

3. Autonomous agents handle complex tasks poorly

The industry hype suggests AI agents can handle complex multi-step business work autonomously. In practice, current-generation agents handle 3-5 step structured tasks reasonably well, struggle with anything that requires real judgment, and fail badly when something unexpected happens. We design agents for the work they're actually good at — structured, repeatable, with clear success criteria — not for tasks that require human judgment.

4. Custom GPTs and Claude Projects are vendor-specific

OpenAI's Custom GPTs only run inside ChatGPT. Claude Projects only run inside Claude. Microsoft Copilot Studio agents only run inside Microsoft 365. Building deeply on any one of these creates vendor lock-in. For workflows you expect to keep for years, we often recommend MCP-based agents instead — same capabilities, vendor-neutral. For quick wins on specific team workflows, Custom GPTs are fine.

What We Build

5 AI agent workflows delivering ROI in 2026

These are the workflows we actually deploy. Ranked by typical ROI per dollar invested.

01

Custom GPTs for specialist team workflows

Specialist team members get AI assistance tuned exactly to their role — proposal writers, customer success, technical support, sales engineers. 30-60% time recovery on routine specialist work.

Custom GPTs (OpenAI) or Claude Projects (Anthropic) tuned for specific business roles — loaded with the right context (your firm's documentation, the team's prior work, role-specific frameworks), with permitted tools, and with a defined task scope. The specialist uses the GPT like a colleague who's read all the relevant context. Not a replacement; an assist.

Tools we use: OpenAI Custom GPTs (inside ChatGPT Team or Enterprise) or Claude Projects (inside Claude for Work). Loaded with role-specific knowledge files. Always for assist, never for autonomous action.

02

MCP-based agents for cross-system task orchestration

Multi-step business tasks (research a prospect, draft a proposal, summarise a project, prepare a brief) get handled by AI agents with structured tool access.

MCP (Model Context Protocol) servers expose your business tools — CRM, accounting, ticketing, calendar, document store — to AI agents through a standardised interface. Claude or GPT agents can compose multi-step tasks on demand, pulling context from your systems and taking actions through the tool layer. Every action is audit-logged. Consequential actions require human approval.

Tools we use: MCP servers (custom or hosted) + Claude Sonnet/Opus or GPT-4-class agents + your business systems. Self-hosted MCP for sensitive data. Always with explicit tool permissions and structured approval gates.
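To show the shape of the MCP pattern without pulling in the SDK, here is a minimal plain-Python sketch. The tool name, schema, and handler are invented for illustration; real deployments use the official MCP SDK, where servers advertise tools with JSON-schema inputs (`tools/list`) and execute them by name (`tools/call`). The key design property is the same: the agent sees names, descriptions, and schemas — never the implementation or the credentials behind it.

```python
# Illustrative only: mimics the MCP server pattern of advertising tools
# with JSON-schema inputs, then executing calls by name.
TOOLS = {
    "crm_lookup": {
        "description": "Fetch a prospect record from the CRM by company name.",
        "inputSchema": {
            "type": "object",
            "properties": {"company": {"type": "string"}},
            "required": ["company"],
        },
        # Hypothetical handler; a real one would call your CRM's API.
        "handler": lambda args: {"company": args["company"], "stage": "qualified"},
    },
}

def list_tools() -> list:
    """What the agent sees: names, descriptions, schemas — never the handlers."""
    return [
        {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
        for n, t in TOOLS.items()
    ]

def call_tool(name: str, arguments: dict):
    """Execute a named tool; unknown names fail loudly rather than guessing."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name]["handler"](arguments)
```

Because the interface is standardised, the same tool layer serves a Claude agent today and a GPT-class agent tomorrow — which is the vendor-neutrality argument for MCP.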

03

Internal task agents with guarded tool access

Repeatable internal tasks (researching, drafting, summarising, scheduling) get reliable agent handling without the safety risks of broad-access deployments.

AI agents for specific internal tasks — research a topic, draft a document type, summarise a meeting, schedule against availability constraints, prepare a brief — with strictly scoped tool access. Each agent has a defined task, a permitted toolset, and a known approval boundary. No agent has unfettered access; every agent is auditable. This is the OpenClaw lesson applied: agents should be narrow specialists, not general-purpose assistants with everything-access.

Tools we use: Claude or GPT agents inside a wrapper that enforces tool permissions, action logging, and approval gates. Increasingly built on MCP for portability.
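The "narrow specialist" idea reduces to a small declarative spec per agent. The sketch below is hypothetical (the agent and tool names are ours, not a library's), but it captures the three properties the text names: a defined task, a permitted toolset, and a known approval boundary.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentSpec:
    """One narrow specialist: a defined task, a toolset, an approval boundary."""
    name: str
    task: str
    permitted_tools: frozenset
    needs_approval: frozenset  # subset of permitted_tools that pauses for a human

    def can_call(self, tool: str) -> bool:
        return tool in self.permitted_tools

    def requires_human(self, tool: str) -> bool:
        return tool in self.needs_approval

# Example spec: summarising is free; publishing is consequential.
meeting_summariser = AgentSpec(
    name="meeting-summariser",
    task="Summarise meeting transcripts into action items",
    permitted_tools=frozenset({"read_transcript", "draft_summary", "post_to_channel"}),
    needs_approval=frozenset({"post_to_channel"}),
)
```

An agent spec like this is auditable at a glance — anyone can read what the agent may touch and where the human sits — which is exactly what "everything-access" assistants can't offer.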

04

AI agents for customer-facing routine tasks

Customer-facing routine tasks (appointment scheduling, status updates, common information requests) get handled reliably 24/7.

Customer-facing agents handle routine queries — scheduling, status updates, common information requests — with strict scope and conservative routing. Anything outside the defined routine pattern routes to a human within seconds, with full context preserved. Build for the predictable workflow excellently; route everything else to humans. This is the pattern that works for AU SMBs across trades, professional services, medical, and property management.

Tools we use: OpenAI Realtime API / Vapi / Bland.ai for voice; web chat via Custom GPT or custom interface for text. Always with structured handoff to human agents.
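Conservative routing is simple to express: handle only intents inside the defined routine set, above a confidence floor, and hand everything else to a human with the transcript attached. The intent labels and threshold below are illustrative assumptions, not a product's API.

```python
# Hypothetical routine scope and confidence floor for a customer-facing agent.
ROUTINE_INTENTS = {"book_appointment", "check_status", "opening_hours"}
CONFIDENCE_FLOOR = 0.8

def route(intent: str, confidence: float, transcript: list) -> dict:
    """Agent handles only confident, in-scope intents; humans get the rest."""
    if intent in ROUTINE_INTENTS and confidence >= CONFIDENCE_FLOOR:
        return {"handler": "agent", "intent": intent}
    # Human handoff: the full transcript travels with the escalation,
    # so the customer never repeats themselves.
    return {
        "handler": "human",
        "context": transcript,
        "reason": "outside routine scope or low confidence",
    }
```

Note the asymmetry: a confident but out-of-scope request (a refund dispute, a complaint) still goes to a human. The agent's job is the predictable workflow, not judgment calls.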

05

Agent-orchestrated knowledge work

Knowledge work that previously required multi-tab context switching (researching, drafting, scheduling, communicating) becomes single-conversation interactions.

Specialist agents that orchestrate knowledge work across multiple tools — sales reps, customer success managers, project managers can ask an agent to 'prepare the briefing for tomorrow's client meeting' or 'draft the project update for the steering committee' or 'pull together the context for the QBR' — and get the result, ready for review and personalisation. Agent does the assembly; human does the judgment.

Tools we use: MCP-based agent + Claude Sonnet/Opus or GPT-4-class model + your business systems exposed as MCP tools. Increasingly the production pattern for complex AI deployments.

Recommended Stack

Tools we build on for AI agents

These are the systems we build AI on top of, not products we sell. Choice depends on your business size and existing stack.

OpenAI Custom GPTs (ChatGPT Team / Enterprise)

Quick wins on specific team workflows. Built and updated inside ChatGPT. Vendor-locked but fast.

Claude Projects (Claude for Work)

Specialist roles needing deep context loading. Strong on long-context reasoning. Vendor-locked but capable.

Microsoft Copilot Studio

Microsoft 365 / Azure ecosystem deployments. Enterprise-friendly with deep M365 integration.

MCP-based custom agents

Vendor-neutral agent deployments where portability matters. Increasingly the production pattern for serious AI work.

OpenAI Realtime / Vapi / Bland.ai

Voice-based customer-facing agents (e.g., after-hours phone answering).

Anthropic / OpenAI / Azure AI (AU East)

Foundation model providers. All have AU-region data residency for compliance-sensitive deployments.

How We Work

What an engagement looks like

AI agent engagements have a different shape from other AI work. The 1-2 week Diagnose phase focuses on identifying which workflows are genuinely good agent candidates — structured, repeatable, with clear success criteria — versus which should stay manual or use simpler AI patterns. Output is a written plan including the security architecture, tool permission model, and audit logging design alongside the agent functionality.

For a typical custom GPT or Claude Project deployment, Deploy is 3-5 weeks. For MCP-based agent platforms, it's 8-14 weeks depending on tool integration complexity. Most businesses start with custom GPTs for quick wins, then graduate to MCP-based agents once the team has experienced what works and what doesn't.

Drive (ongoing) is essential for agents because the AI capabilities evolve rapidly — what was best-in-class six months ago is mid-tier today. Monthly retainer covers model updates, capability expansion, and ongoing tuning as the team learns from real usage.

Custom GPT / Claude Project

1-3 specialist GPTs

Quick wins for specific team roles. 3-5 weeks per GPT. Fixed price.

Internal agent platform

MCP-based + 3-6 agents

Coordinated agent set with structured tool access and human approval gates. 8-14 weeks.

Customer-facing agents

Voice or text

Production-grade customer-facing agents with conservative routing and human handoff. 6-12 weeks.

Real Engagement

How an AU SaaS deployed 4 specialist Claude Projects in 5 weeks

An Australian B2B SaaS company (~$8M ARR, 22 staff) wanted to give specialist team members AI assistance tuned to their roles — without ChatGPT Team-level data exposure or generic productivity AI. They had Claude for Work licences but team adoption was low because the default Claude didn't know their product, customers, or processes.

We built 4 Claude Projects in parallel: a Sales Engineer assistant loaded with product documentation and prior customer questions; a Customer Success assistant loaded with onboarding playbooks and customer health context; a Proposal Writer assistant loaded with prior winning proposals and pricing logic; a Support Specialist assistant loaded with the product knowledge base and ticket-resolution patterns.

Within 5 weeks: all 4 Projects deployed and used daily by their respective teams. Estimated time saved across the 4 roles: ~30 hours/week. Customer-facing communication quality improved measurably as the assistants compressed routine drafting time, freeing the team for the harder customer relationship work.

Client identity withheld under engagement confidentiality. Outcomes and metrics accurate as deployed.

FAQ

Common questions about AI agents

Aren't AI agents the autonomous future of work?

The honest 2026 answer: not yet. Current-generation agents handle 3-5 step structured tasks reasonably well, struggle with anything requiring real judgment, and fail badly when something unexpected happens. The autonomous-everything vision exists in vendor demos and venture capital decks; the production reality is structured agents handling specific tasks with human oversight on consequential decisions. That's what we build. The vendors promising autonomous knowledge work in 2026 are mostly selling future capability they don't yet have.

What's the security risk profile?

Agents with broad tool access are higher risk than single-purpose AI tools because they can be tricked or manipulated. The OpenClaw incident (March 2026) demonstrated this clearly. We design with strict tool permissions (each agent has minimum necessary access), audit logging on every action, human approval gates on consequential operations, and structured failure modes. The right metaphor is 'AI as a constrained specialist with clear access boundaries' rather than 'AI as a general assistant with everything-access'.

Should we use Custom GPTs, Claude Projects, or MCP-based agents?

Depends on your situation. Custom GPTs and Claude Projects are faster to deploy and easier to maintain — but they lock you to OpenAI or Anthropic respectively. MCP-based custom agents are vendor-neutral but require more engineering work. For team-specific quick wins where you don't need portability, Custom GPTs or Claude Projects are fine. For workflows you expect to maintain for years or that need to work across multiple LLM providers, MCP is the right pick. We help you decide per workflow during Diagnose.

Can the agent actually take actions or just chat?

Both, depending on the design. Read-only conversational agents (Q&A bots, knowledge retrieval) are low-risk and can run without human approval. Agents that take consequential actions (sending external communication, modifying records, triggering payments) require structured human approval gates in our standard implementations. We can build fully autonomous agents for low-stakes internal tasks — but for anything customer-facing or financially material, the approval gate stays.
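The approval gate described above is structurally simple: consequential actions are proposed into a queue, and nothing runs until a human releases it. A minimal sketch of that pattern, with invented method names:

```python
class ApprovalQueue:
    """Consequential actions are queued, not executed; a human releases them."""

    def __init__(self):
        self.pending = {}      # action_id -> (action, payload)
        self.executed = []
        self._next_id = 0

    def propose(self, action: str, payload: dict) -> int:
        """Agent calls this instead of executing directly."""
        self._next_id += 1
        self.pending[self._next_id] = (action, payload)
        return self._next_id

    def approve(self, action_id: int) -> None:
        """Human sign-off: only now does the action actually run."""
        action, payload = self.pending.pop(action_id)
        self.executed.append((action, payload))

    def reject(self, action_id: int) -> None:
        """Dropped, never executed — the safe default for anything doubtful."""
        self.pending.pop(action_id)
```

The design choice worth noting: the agent never holds the capability to execute, only to propose. Rejection costs nothing; a bad autonomous action can be unrecoverable.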

What does this cost?

Custom GPT / Claude Project deployments run AU$15-30k per assistant including the data preparation, prompt engineering, and team training. MCP-based agent platforms with multiple agents and tool integration run AU$60-130k over 8-14 weeks. Voice-based customer-facing agents (after-hours phone answering) run AU$25-50k for a single high-volume use case.

How do we know if our team is ready for agents?

Honestly, most teams should start with simpler AI workflows before adding agents — and we'll tell you upfront during Diagnose if we think you should. The teams that succeed with agents typically have already deployed at least 2-3 simpler AI workflows successfully (document drafting, email triage, knowledge retrieval) so they understand what AI does well and where it breaks. Agents work best as a maturity step, not an entry point. If you don't yet have AI workflows in production, we'll usually recommend starting with one of those before agents.

Talk to us about AI agents for your business

Free 30-minute Diagnose call. We'll look at where agent automation would genuinely help, identify the right pattern (Custom GPT, Claude Project, MCP-based agent), and tell you upfront whether your business is ready for agents or should start with simpler AI workflows.

Book a Diagnose call