Operations & quality

Budget & Costs

This section is about the costs your AI assistant incurs (for language-model requests, phone calls, messages, and so on), and how to set a spending cap so that you never spend more than you want. Prezio tracks every cent across all channels and can automatically stop the assistant as soon as a daily limit you have set is reached.

Important first: The actual input field for the spending cap ("Daily spend cap" / "Daily budget") as well as the detailed cost reports (spend reports) are configured in Prezio as administrator functions. If you don't see these fields in your dashboard, your account doesn't have admin rights – in that case your Prezio contact sets the cap for you. Simply ask for the daily value you want (e.g. "10 USD per day per assistant"). The way it works is identical in both cases and fully described below.


Setting the daily budget (spending cap)

The daily budget is a hard cap in US dollars per assistant and per rolling 24-hour period. Once it is reached, the assistant stops spending money on its own – across all channels. This reliably protects you from unexpectedly high bills, for example if someone hits your web chat in an automated loop.
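
The rolling 24-hour check can be pictured with a small sketch. This is not Prezio's actual implementation; the function names and the (timestamp, cost) entry format are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)  # rolling window, not the calendar day


def spend_in_window(entries, now):
    """Sum the cost of all entries newer than 24 hours before `now`.

    `entries` is a list of (timestamp, cost_usd) pairs (assumed format).
    """
    cutoff = now - WINDOW
    return sum(cost for ts, cost in entries if ts > cutoff)


def is_over_cap(entries, cap_usd, now):
    """True once the rolling total reaches or exceeds the cap.

    `cap_usd=None` stands for an empty field, i.e. no limit.
    """
    if cap_usd is None:
        return False
    return spend_in_window(entries, now) >= cap_usd


now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
entries = [
    (now - timedelta(hours=30), 4.00),  # already dropped out of the window
    (now - timedelta(hours=20), 6.50),
    (now - timedelta(hours=1), 3.75),
]
print(spend_in_window(entries, now))     # 10.25
print(is_over_cap(entries, 10.00, now))  # True
print(is_over_cap(entries, None, now))   # False
```

Note how the 30-hour-old entry no longer counts: this is why the assistant resumes without any manual reset once older costs age out.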

Step 1: Open the assistant's overview. Go to your assistant and open the "Overview" tab. If necessary, switch to advanced mode ("Advanced") in the top right, because the model and budget settings are in the "AI model" card, which is only visible in advanced mode.

Step 2: Enter the daily budget. In the "AI model" card you will find (provided it is enabled for your account) the "Daily spend cap (USD)" or "Daily budget (USD)" field.

  • What it does: The total of all of the assistant's costs over the last 24 hours is checked against this value. As soon as it reaches or exceeds the mark, the assistant stops (see the next section). It is not just web chat that counts, but everything: chat, WhatsApp, SMS, email, phone calls (voice) as well as chargeable background actions such as website crawling, menu recognition by photo (OCR) and the retrieval test.
  • Input: A number in US dollars, adjustable in steps of 0.50, minimum 0. Example placeholder in the field: "e.g. 10.00 — leave empty for no limit".
  • Default: empty (no limit). A newly created assistant therefore has no cap – costs keep running without limit until you set one yourself.
  • Recommendation: Set a value in any case. A good starting point for a small business is 5–15 USD per day. In the first few weeks, watch your actual daily spend (see the cost report below) and set the limit a little above that with some buffer, so that normal operation isn't unnecessarily blocked.

Step 3: Save. Scroll down and click the save button, which stays visible at the bottom of the page. The cap only becomes active after saving. (An empty field saves "no limit".)
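
If you want to derive a starting value from your own figures, the recommendation above (actual daily spend plus some buffer) can be sketched as follows. The 25 % buffer and the function name are assumptions for illustration, not a Prezio rule; only the 0.50 step size comes from the input field described above.

```python
import math


def suggest_cap(daily_spend_usd, buffer=0.25, step=0.50):
    """Suggest a daily cap: highest observed day plus a buffer,
    rounded up to the next multiple of `step`."""
    peak = max(daily_spend_usd)
    target = peak * (1 + buffer)
    # round() guards against tiny floating-point errors before rounding up
    steps = math.ceil(round(target / step, 9))
    return steps * step


# Observed daily spend from the first weeks (made-up example values)
observed = [3.10, 4.80, 6.20, 5.40]
print(suggest_cap(observed))  # 8.0
```

A larger buffer makes sense if your busiest days vary a lot; a smaller one if you want tighter protection.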


What happens when the budget is reached

Important: Prezio does not lock your assistant permanently, but it sends no chargeable replies while the limit is exceeded. The behaviour differs by channel, but is always designed so that your customers never see the internal budget figures:

  • Web chat (embedded widget / portal): Requests are rejected before the language-model call. The customer receives a neutral message along the lines of "This assistant is temporarily unavailable. Please try again later." (technically HTTP 429 with a retry hint of 1 hour). The specific budget amount is never shown externally.
  • WhatsApp, SMS, email: The incoming message is accepted, but no AI reply is generated – the assistant simply stays silent instead of spending money. From the customer's point of view, the result is a missing reply, not an error message.
  • Phone (voice): Instead of a (chargeable) AI conversation, a "temporarily unavailable" announcement is played. If the budget check itself fails (e.g. a brief technical glitch), the call is not blocked ("fail open") – when in doubt, telephony takes priority.
  • Chargeable configuration actions (website crawl in the knowledge base, menu recognition by photo, the retrieval test in the AI debug area): When the budget is reached, these are paused with a note, so that setup, too, doesn't secretly blow through your daily limit.

As soon as the 24-hour period rolls forward and older costs drop out of the window, the assistant resumes operation on its own – you don't have to reset anything.
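
The per-channel behaviour described above can be summarised in a short sketch. It is a simplified model, not Prezio's code: the `Outcome` type, the `handle` function and the setup-action labels are hypothetical, and re-raising a failed budget check for non-voice channels is an assumption, since only the voice "fail open" rule is documented.

```python
from dataclasses import dataclass
from typing import Callable, Optional

RETRY_AFTER_SECONDS = 3600  # the "retry hint of 1 hour"
NEUTRAL_MESSAGE = "This assistant is temporarily unavailable. Please try again later."

WEB_CHANNELS = {"embed", "portal"}
MESSAGING_CHANNELS = {"whatsapp", "sms", "email"}
SETUP_ACTIONS = {"crawl", "ocr", "retrieval_test"}  # hypothetical labels


@dataclass
class Outcome:
    action: str  # "proceed", "reject", "silent", "announcement" or "paused"
    http_status: Optional[int] = None
    retry_after: Optional[int] = None
    message: Optional[str] = None


def handle(channel: str, cap_reached: Callable[[], bool]) -> Outcome:
    """Decide what happens to one incoming event on `channel`."""
    try:
        over = cap_reached()
    except Exception:
        if channel == "voice":
            return Outcome("proceed")  # fail open: telephony takes priority
        raise  # assumption: other channels are not documented
    if not over:
        return Outcome("proceed")
    if channel in WEB_CHANNELS:
        # rejected before any language-model call; no budget figures exposed
        return Outcome("reject", 429, RETRY_AFTER_SECONDS, NEUTRAL_MESSAGE)
    if channel in MESSAGING_CHANNELS:
        return Outcome("silent")  # message accepted, no AI reply generated
    if channel == "voice":
        return Outcome("announcement", message="temporarily unavailable")
    if channel in SETUP_ACTIONS:
        return Outcome("paused", message="Daily budget reached")
    raise ValueError(f"unknown channel: {channel}")


print(handle("embed", lambda: True))
print(handle("sms", lambda: True))
```

None of the outcomes carries the cap or the current spend, which mirrors the rule that customers never see budget figures.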

Note: Phone calls (voice) are the biggest cost driver in most accounts. The daily budget deliberately counts them in – there is no pure "chat-only limit".


The cost report (spend report)

You can see how much your assistant actually costs in the analytics area (in Prezio under "Admin overview"). Like the budget field, this area is an operator/admin view; if it isn't shown to you, please request the figures from your Prezio contact. To read the report, proceed as follows:

Step 1: Choose a period. At the top, you switch between the "24h", "7d", "30d" and "90d" windows. Default: 30 days. All the metrics below relate to the chosen window.

Step 2: Read the metric tiles. The top bar shows:

  • "Total cost" – total costs in the period in USD.
  • "Requests" – number of chargeable calls.
  • "Active agents" – how many assistants incurred any costs at all.
  • "Distinct users" – number of accounts involved.
  • "Input tokens" / "Output tokens" – the language model's consumed input/output tokens (abbreviated in k/M). Tokens are the billing unit of the AI providers.

Step 3: Check the trend and breakdown.

  • "Spend over time" – a trend curve of daily costs over the chosen window. Ideal for spotting outlier days.
  • "By channel" – costs broken down by channel (e.g. voice, whatsapp, sms, email, embed, portal), as bars with a percentage share and a USD amount. This lets you see immediately whether, for example, telephony makes up the lion's share.
  • "By model" – costs per AI model used (e.g. gpt-4o, gpt-4o-mini). Phone calls appear as a combined entry (e.g. vapi-voice-call), since a call bundles several underlying services.
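
To illustrate what the "By channel" breakdown expresses, here is a minimal sketch that groups cost entries by channel and computes each channel's USD amount and percentage share. The entry format and the sample values are made up; the real report is calculated by Prezio.

```python
from collections import defaultdict


def by_channel(entries):
    """Return {channel: (usd_total, percent_share)}, largest first.

    `entries` is a list of (channel, cost_usd) pairs (assumed format).
    """
    totals = defaultdict(float)
    for channel, cost in entries:
        totals[channel] += cost
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {
        channel: (amount, 100 * amount / grand_total if grand_total else 0.0)
        for channel, amount in ranked
    }


# Made-up example entries
entries = [
    ("voice", 4.0),
    ("embed", 2.5),
    ("voice", 2.0),
    ("whatsapp", 1.5),
]
for channel, (usd, share) in by_channel(entries).items():
    print(f"{channel:10s} {usd:6.2f} USD  {share:5.1f} %")
```

In this example voice accounts for 60 % of the total, which is the kind of pattern the bar chart makes visible at a glance.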

Step 4: See the biggest cost drivers and limits. The "Top 10 agents by spend" table lists the most expensive assistants with columns Agent, Owner (owner email), Requests, Tokens (in/out), Cost and – particularly relevant for this chapter – "Cap": this shows the daily budget that has been set (e.g. $10/d) or "—" if no limit is set. Via "See all →" you reach the full list; a separate analysis additionally rolls up the costs per owner account.

Step 5 (optional): Export the data. The analytics can be downloaded as CSV (user, assistant and audit lists). The assistant CSV contains, per row, things like name, owner_email, daily_budget_usd, requests and cost_usd – handy for your own bookkeeping or a spreadsheet.
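
Once exported, the assistant CSV can be processed with a few lines of script. The sketch below lists assistants that have no cap set. It assumes the column names mentioned above and that a missing cap appears as an empty daily_budget_usd cell; the sample rows are invented, and the real export may contain further columns.

```python
import csv
import io

# Invented sample in the shape described above
SAMPLE = """name,owner_email,daily_budget_usd,requests,cost_usd
Bistro Bot,owner@example.com,10.00,420,37.80
Salon Helper,salon@example.com,,95,4.15
"""


def uncapped_assistants(csv_text):
    """Return (name, cost_usd) for every row without a daily budget."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["name"], float(row["cost_usd"]))
        for row in rows
        if not row["daily_budget_usd"].strip()
    ]


print(uncapped_assistants(SAMPLE))  # [('Salon Helper', 4.15)]
```

For a real file, read its contents with open(path, newline="", encoding="utf-8") and pass the text to the function.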


How prices are determined

You don't have to enter any prices – Prezio calculates fully automatically and with current rates:

  • Token prices (language model) are dynamic. They come from a continuously maintained price catalogue (LiteLLM) and are updated every 6 hours; in the event of network problems, a bundled, last-known state is used. So there are no hard-coded prices – when a provider changes its rates, Prezio follows suit.
  • All provider costs are captured. Every chargeable external call writes a line into the cost tracking: language-model calls (chat and image analysis), text embeddings for the knowledge base, phone calls (incl. a breakdown into language model/speech output/speech recognition/transport), SMS and WhatsApp via Twilio, incoming and outgoing emails, as well as the monthly rental for a provisioned phone number. It is exactly this captured total that forms the basis both for the cost report and for the budget check.
  • Not included in this total are, by design, a few items that don't fit the "per call" scheme – in particular Stripe payment fees (these are deducted directly from payouts; see your Stripe account statement) as well as pure storage costs for uploaded files. These therefore appear neither in the daily budget nor in the spend report.
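
As a rough illustration of how a token-based cost line comes about, the sketch below multiplies token counts by per-token prices from a catalogue. The model name and the prices are placeholders, not real rates, and the key names are modelled on LiteLLM's price file as an assumption; Prezio's internal calculation may differ in detail.

```python
from decimal import Decimal

# Placeholder catalogue: model name and prices are invented for illustration
CATALOGUE = {
    "example-model": {
        "input_cost_per_token": Decimal("0.000001"),
        "output_cost_per_token": Decimal("0.000002"),
    },
}


def request_cost(model, input_tokens, output_tokens, catalogue=CATALOGUE):
    """Cost of one language-model call in USD."""
    prices = catalogue[model]
    return (
        input_tokens * prices["input_cost_per_token"]
        + output_tokens * prices["output_cost_per_token"]
    )


cost = request_cost("example-model", input_tokens=1200, output_tokens=300)
print(cost)  # 0.001800
```

Decimal is used here so that many small amounts add up without floating-point drift. When the catalogue is refreshed, only the price values change; the calculation stays the same.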

Tips & pitfalls

  • Without a set limit there is no protection. The default is "empty = no limit". Actively set a cap – otherwise a faulty integration or an abuse attempt can incur unlimited costs.
  • The limit is a hard boundary, not a "soft warning". When it is reached, the assistant is muted, not merely warned. Set the limit generously enough that normal peak times (e.g. a full lunch service) aren't cut off.
  • The window is rolling (the last 24 h), not the calendar day. The limit doesn't "reset" at midnight, but frees up gradually as soon as older costs drop out of the 24-hour window.
  • Phone calls eat the budget fastest. If your limit is reached surprisingly early, check the "By channel" breakdown first – usually voice is the driver.
  • The amount is in US dollars (USD), regardless of your billing currency. When setting it, factor in the exchange rate roughly if necessary.
  • Fields not visible? Both the budget input field and the spend report are admin/operator-side in Prezio. If you don't see them, that's not a fault – contact your Prezio representative for the limit and the analytics.
  • Customers never see figures. Whatever the channel: when the budget is reached, your customers only get a neutral "temporarily unavailable" reply or none at all – never the budget amount or the current spend.