Contact centre live chat WFM

Q: How do you calculate staffing for a live chat queue?

Chat staffing uses a concurrency model. The formula is: Agents required = (Chat contacts per hour × Average chat duration in hours ÷ Concurrency ratio) × Shrinkage multiplier, where the shrinkage multiplier is 1 ÷ (1 − shrinkage). The concurrency ratio is the average number of simultaneous chats an agent can handle while maintaining the response time target. Example: if 120 chats arrive per hour, the average chat lasts 8 minutes, and the concurrency ratio is 3 (each agent handles 3 concurrent chats), then 120 × (8 ÷ 60) = 16 chats are active at once on average, and the productive agents required = 16 ÷ 3 = 5.3. Apply 25% shrinkage: 5.3 ÷ 0.75 = 7.1, round up to 8 agents on-shift.

The concurrency ratio must be validated empirically. A ratio that is appropriate for simple transactional chat contacts (3–4) is too high for complex advisory chat contacts (1.5–2). Apply one ratio uniformly across contact types and the model will understaff complex contacts and overstaff simple ones. The right approach is to calculate a weighted concurrency ratio based on the mix of contact types, or to staff complex and simple chat contacts separately using different concurrency assumptions.
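Both approaches named above (separate staffing and a weighted ratio) can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive method: the contact mix, the per-type durations, and the function name are hypothetical, and workload is taken as contacts × average chat duration, the same contact-minutes logic as the worked example later in this guide.

```python
import math

INTERVAL_MIN = 60  # staffing interval in minutes (one hour)

# Hypothetical contact mix: (contacts per hour, avg chat minutes, concurrency ratio)
mix = {
    "simple_transactional": (80, 6.0, 3.0),
    "complex_advisory": (40, 12.0, 1.5),
}

def productive_agents(contacts, duration_min, ratio, interval_min=INTERVAL_MIN):
    """Agents needed on chat for one contact type, before shrinkage."""
    workload = contacts * duration_min   # contact-minutes of chat work
    per_agent = ratio * interval_min     # contact-minutes one agent can cover
    return workload / per_agent

# Approach 1: staff each contact type separately, then add the results.
separate = {name: productive_agents(*inputs) for name, inputs in mix.items()}
total_productive = sum(separate.values())

# Approach 2: derive the weighted concurrency ratio implied by the mix.
# Weighting is by workload (contact-minutes), not by contact count.
total_workload = sum(c * d for c, d, _ in mix.values())
weighted_ratio = total_workload / (total_productive * INTERVAL_MIN)

shrinkage = 0.25  # assumed 25% shrinkage
on_shift = math.ceil(total_productive / (1 - shrinkage))

print(separate, round(weighted_ratio, 2), on_shift)
```

With these made-up inputs the blended ratio works out to 2.0, lower than the 2.5 that a simple contact-count average of the two ratios would give, because the complex chats carry half of the workload despite being a third of the contacts.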

Live chat is neither voice nor email. It is synchronous but concurrent — agents handle two, three, or four chats simultaneously. Erlang C assumes one contact per agent. The chat staffing model requires a concurrency ratio as its central variable.

Why concurrency makes chat different

Not voice

Erlang C assumes one contact per agent at a time. In a chat environment with a concurrency ratio of 3, each agent serves 3 customers simultaneously. Applying Erlang C to chat without adjusting for concurrency produces a staffing requirement that is typically 2–3× too high, or one that is too low if the concurrency adjustment is applied the wrong way round.
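To see the size of the gap, the sketch below runs a standard Erlang C calculation next to the concurrency calculation for the same demand. The inputs (90 contacts in 30 minutes, 8-minute chats, ratio 3) come from the worked example in this guide. The 80%-in-20-seconds service-level target and the function names are assumptions made for illustration, and the exact multiple will move with whatever target is chosen.

```python
import math

def erlang_c_wait_probability(agents: int, traffic: float) -> float:
    """Probability that a contact has to wait (Erlang C), traffic in erlangs."""
    if agents <= traffic:
        return 1.0  # unstable queue: every contact waits
    # Erlang B via the standard recursion, which avoids large factorials.
    b = 1.0
    for n in range(1, agents + 1):
        b = traffic * b / (n + traffic * b)
    # Convert Erlang B to Erlang C.
    return b / (1 - (traffic / agents) * (1 - b))

def erlang_c_agents(contacts, interval_min, duration_min,
                    target_sl=0.80, target_sec=20):
    """Smallest agent count meeting the (assumed) service-level target."""
    traffic = contacts * duration_min / interval_min
    agents = math.floor(traffic) + 1
    while True:
        pw = erlang_c_wait_probability(agents, traffic)
        sl = 1 - pw * math.exp(-(agents - traffic) * target_sec / (duration_min * 60))
        if sl >= target_sl:
            return agents
        agents += 1

def concurrency_agents(contacts, interval_min, duration_min, ratio):
    """Productive agents under the concurrency model in this guide."""
    return math.ceil(contacts * duration_min / (ratio * interval_min))

one_per_agent = erlang_c_agents(90, 30, 8)
with_concurrency = concurrency_agents(90, 30, 8, ratio=3)
print(one_per_agent, with_concurrency)
```

The Erlang C figure also includes a queueing buffer above raw workload, which the simple concurrency model does not, so the two numbers are not a like-for-like comparison; the point is only the order of magnitude of the overstatement.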

Not email

Email is asynchronous: agents work at their own pace through a backlog, and response delay is acceptable. Chat is synchronous: the customer is waiting for the agent's reply in real time. A slow response to a chat is experienced as being ignored. The throughput model used for email backlogs does not account for this real-time response expectation.

A concurrency model

The correct model calculates: how many active simultaneous chat contacts can one agent manage while still responding to each within the target response time? The answer depends on chat complexity, agent typing speed, and the pattern of customer reply lag. Set the ratio correctly and the model is accurate; set it wrong and every staffing calculation that follows will be wrong.

Concurrency ratio guide

1 chat per agent

Occupancy: Low (equivalent to voice occupancy)

Response time risk: None

Quality risk: None

Suitable for: High-complexity or sensitive contacts (complaints, medical, financial advice) where agent attention must not be divided.

2 chats per agent

Occupancy: Moderate

Response time risk: Low. The agent can respond to each chat within 60–90 seconds in most intervals.

Quality risk: Low. Agent switching cost is manageable for most contact types.

Suitable for: Standard advisory contacts with moderate complexity. The most common default for contact centres new to chat.

3 chats per agent

Occupancy: High

Response time risk: Moderate. Response time deteriorates under volume spikes when all 3 chats are active simultaneously.

Quality risk: Moderate. Agents may miss nuance in individual chats when managing three simultaneously.

Suitable for: Simple transactional contacts (order status, password resets, FAQ queries) where the agent response time is less critical than throughput.

4+ chats per agent

Occupancy: Very high (the agent is effectively fully occupied at all times)

Response time risk: High. Response time breaches are likely in any interval above average volume.

Quality risk: High. Error rate increases, agents miss cues, and chat interactions become formulaic. Customer satisfaction deteriorates significantly.

Suitable for: Bot-augmented chats where the agent is supervising automated responses rather than typing them. Not appropriate for human-only chat at this ratio.

The chat staffing calculation

Worked example — staffing a chat queue at 09:00–09:30:

Inputs

Forecast: 90 chat contacts in the 09:00–09:30 interval; average chat duration (ACD): 8 minutes; concurrency ratio: 3; shrinkage: 25%

Step 1 — contact-minutes in interval

90 contacts × 8 minutes = 720 contact-minutes of chat work

Step 2 — agent-minutes available (without shrinkage)

1 interval = 30 minutes. At concurrency 3: each agent can handle 3 × 30 = 90 contact-minutes of concurrent work per interval

Step 3 — productive agents required

720 contact-minutes ÷ 90 contact-minutes per agent = 8.0 agents on chat

Step 4 — on-shift agents (with shrinkage)

8.0 ÷ 0.75 = 10.7 → 11 agents on-shift

Step 5 — validate against response time

At 90 contacts and an 8-minute average duration, the average number of simultaneously active chats = 90 × (8 ÷ 30) = 24. At 8 productive agents and concurrency 3, the maximum number of simultaneous chats = 8 × 3 = 24. The model is in balance, but with no headroom: capacity exactly matches average demand. If volume spikes to 120 contacts in the interval, average active chats = 120 × (8 ÷ 30) = 32, which exceeds the capacity of 24. Response time will deteriorate.
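The five steps can be reproduced with a short Python sketch. The function and key names are illustrative choices rather than part of any WFM tool, and the 24 and 32 figures from Step 5 are treated here as the average number of simultaneously active chats in the interval.

```python
import math

def chat_staffing(contacts, duration_min, interval_min, ratio, shrinkage):
    """Walk through Steps 1 to 5 of the worked example for one interval."""
    workload = contacts * duration_min                   # Step 1: contact-minutes
    per_agent = ratio * interval_min                     # Step 2: contact-minutes per agent
    productive = math.ceil(workload / per_agent)         # Step 3: productive agents
    on_shift = math.ceil(productive / (1 - shrinkage))   # Step 4: apply shrinkage
    return {
        "workload_min": workload,
        "productive_agents": productive,
        "on_shift_agents": on_shift,
        "avg_active_chats": workload / interval_min,     # Step 5: demand for chat slots
        "slot_capacity": productive * ratio,             # Step 5: chat slots available
    }

def average_active_chats(contacts, duration_min, interval_min):
    """Average number of chats open at the same time during the interval."""
    return contacts * duration_min / interval_min

plan = chat_staffing(contacts=90, duration_min=8, interval_min=30,
                     ratio=3, shrinkage=0.25)
spike = average_active_chats(contacts=120, duration_min=8, interval_min=30)

print(plan)
print("spike exceeds capacity:", spike > plan["slot_capacity"])
```

Keeping demand (`avg_active_chats`) and capacity (`slot_capacity`) as separate outputs makes the Step 5 check a direct comparison, and the same function can be re-run per interval with that interval's forecast.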

Chat-specific intraday management

Monitor active concurrent chats, not queue length

In a voice queue, the primary intraday metric is contacts in queue. In chat, the equivalent is active chats per agent. If agents are at their concurrency cap (every agent holding 3 chats at a ratio of 3), new contacts queue and response time in existing chats starts to deteriorate. Monitor the active chat saturation rate (active chats as a share of available chat slots) by agent group, not total queue length.
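One way to compute a saturation rate per agent group is sketched below. The `AgentState` record, the group names, the snapshot data, and the 85% alert threshold are all hypothetical; a real feed from a chat platform would need mapping into this shape.

```python
from dataclasses import dataclass

@dataclass
class AgentState:
    group: str            # agent group or skill (hypothetical names below)
    active_chats: int     # chats the agent currently has open
    concurrency_cap: int  # maximum chats the agent may hold at once

def saturation_by_group(agents):
    """Return {group: share of chat slots currently in use}."""
    used, total = {}, {}
    for a in agents:
        used[a.group] = used.get(a.group, 0) + a.active_chats
        total[a.group] = total.get(a.group, 0) + a.concurrency_cap
    return {g: used[g] / total[g] for g in total}

# Hypothetical real-time snapshot of five agents.
snapshot = [
    AgentState("billing", 3, 3),
    AgentState("billing", 3, 3),
    AgentState("billing", 2, 3),
    AgentState("orders", 1, 3),
    AgentState("orders", 2, 3),
]

ALERT_THRESHOLD = 0.85  # assumed trigger level, to be tuned locally
rates = saturation_by_group(snapshot)
alerts = [g for g, r in rates.items() if r >= ALERT_THRESHOLD]
print(rates, alerts)
```

In this snapshot the billing group has 8 of 9 slots in use and would be flagged, while orders sits at half capacity, a difference that a single total-queue-length figure would hide.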

Chat contacts have variable duration — plan for long-tail

Chat ACD varies more than voice ACD because the customer's typing speed is outside the agent's control. A queue planned at an 8-minute average may have 20% of contacts running 15+ minutes because the customer is typing slowly or is distracted. These long-duration chats tie up agent slots without proportional workload. The concurrency model should use a distribution of chat durations, not just the average.
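A small sketch shows why the average alone hides the long tail. The sample below is invented so that it matches the figures in the paragraph (an 8-minute mean with 20% of chats at 15+ minutes); real analysis would use historical chat durations, and the function name is illustrative.

```python
def long_tail_profile(durations_min, long_threshold_min=15):
    """Summarise how much slot time long-running chats consume."""
    long_chats = [d for d in durations_min if d >= long_threshold_min]
    return {
        "mean_duration": sum(durations_min) / len(durations_min),
        "share_of_contacts": len(long_chats) / len(durations_min),
        "share_of_slot_minutes": sum(long_chats) / sum(durations_min),
    }

# Hypothetical sample: eight 6-minute chats and two 16-minute chats.
sample = [6, 6, 6, 6, 6, 6, 6, 6, 16, 16]
profile = long_tail_profile(sample)
print(profile)
```

In this invented sample the long chats are 20% of contacts but hold 40% of the slot-minutes, which is the effect a plan built only on the 8-minute mean would miss.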

Break management is more disruptive for chat than voice

When a voice agent takes a break, they finish the current call and become unavailable. The impact is immediate and clean. When a chat agent starts a break, they may have 2–3 active chats that cannot be abandoned and must be completed before the break starts. In practice, agents must begin a soft close (taking no new chats) 5–10 minutes before their scheduled break time. The WFM break plan must account for this soft-close period, or breaks will consistently run late and schedules will drift.
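The soft-close rule can be expressed as a simple routing check. This is a sketch only: the function names, the 10-minute default, and the example date and times are assumptions, not settings from any particular chat platform.

```python
from datetime import datetime, timedelta

def soft_close_start(break_start: datetime, soft_close_min: int = 10) -> datetime:
    """Time from which the agent should stop receiving new chats."""
    return break_start - timedelta(minutes=soft_close_min)

def can_accept_new_chat(now: datetime, break_start: datetime,
                        soft_close_min: int = 10) -> bool:
    """True while the agent is still outside the soft-close window."""
    return now < soft_close_start(break_start, soft_close_min)

# Hypothetical schedule: break at 10:30, soft close begins 10 minutes earlier.
break_start = datetime(2024, 1, 15, 10, 30)
cutoff = soft_close_start(break_start)

print(cutoff.time())
print(can_accept_new_chat(datetime(2024, 1, 15, 10, 15), break_start))
print(can_accept_new_chat(datetime(2024, 1, 15, 10, 25), break_start))
```

For capacity planning, the agent's chat slots would be treated as unavailable for new contacts from the cutoff time rather than from the scheduled break time.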

Live chat WFM questions

How do you calculate staffing for a live chat queue?

Use the concurrency model: Agents required = (Chat contacts per hour × Average chat duration in hours ÷ Concurrency ratio) × Shrinkage multiplier. Example: 120 contacts per hour at an 8-minute average chat duration means 120 × (8 ÷ 60) = 16 chats active at once on average; at concurrency ratio 3 that is 16 ÷ 3 = 5.3 productive agents. Apply shrinkage at 25%: 5.3 ÷ 0.75 = 7.1, round up to 8 on-shift. The concurrency ratio (chats per agent at any one moment) is the critical variable, so validate it empirically from historical data rather than using a default. Complex advisory chats typically support a ratio of 1.5–2; simple transactional chats may support 3–4. Apply a uniform ratio across mixed contact types and the model will systematically misstaff.