June 16, 2026

Measuring what matters: Why productivity is the challenging part of ‘personal productivity AI’

Increased employee productivity is the promise of AI, but measuring it remains elusive. This article explains why effectiveness is a better metric than productivity and how organizations can use AI usage data to improve GenAI outcomes.

“We are pro-AI because we want our staff to be more productive.” Makes sense, right? But do you know if that productivity is materializing?

“Productivity” is one of the most loaded words in economics. For employees engaged in repetitive process work, like customer service, you have process KPIs to track. For general knowledge work, however, you are left with no universal metric.

Knowing whether that productivity is materializing requires confronting a problem economists have wrestled with for decades. And it explains why, at NROC Security, we took a different approach to the issue.

The productivity paradox

In 1987, economist Robert Solow made a now-famous observation: “You can see the computer age everywhere but in the productivity statistics.” Despite massive investment in computing technology throughout the 1970s and 80s, national productivity numbers were not moving.

Economists call this the productivity paradox. It was not that computers did not work. It was that the benefits took a long time to materialize — and they were difficult to measure even when they did.

Electric dynamos required factories to be rewired in the early 20th century, which delayed the economic impact by 20–30 years. The internet was faster, but even there the results took a good 10–15 years, because entire businesses had to be rewired. Home pages and early dot-com initiatives made headlines but did not deliver results.

GenAI is in that early phase. The models are capable. Metrics tracking access to the technology, even adoption, are like counting home pages and dot-com initiatives. They measure presence, not impact.

Knowledge work productivity is measured by effectiveness

Traditional productivity has a clean definition in manufacturing: output divided by input. Units per hour. Revenue per headcount. Defects per thousand.

Knowledge work destroyed that clean definition fifty years ago. Peter Drucker, who coined the term “knowledge worker,” argued that measuring a knowledge worker's productivity requires understanding the quality and relevance of the output, not just its volume. The challenge is that quality and relevance mean completely different things depending on the job.

A lawyer being “more productive” with AI might mean fewer billable hours per case — which is bad for revenue, even if good for the client. A sales rep being more productive might mean shorter cycles, larger deal sizes, or higher win rates — three metrics that rarely move together. An engineer’s output might be features shipped, bugs fixed, or review speed, and experienced engineers will tell you these trade off badly against each other.

Effectiveness is job-agnostic in a way that productivity is not. You can measure it without knowing whether a lawyer, a sales rep, or an engineer is behind the keyboard. The question is always the same: is this person applying AI in a way that extracts meaningful value from the interaction?

The components of effectiveness

At the top level, this is a variant of the familiar quantity x quality or price x quantity frameworks. Is the technology used often enough and “well” enough to be effective?

Measurable vs. observable

Most components of effectiveness are measurable, and the rest can at least be sampled to gain insights. Frequency of use is a good example: the number of uses per user and per app can be obtained directly. But finding out whether the app selection or a lack of confidence is keeping non-users away requires organizational surveys or other means.
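To make the "measurable" side concrete, here is a minimal sketch of counting uses per user and app from a usage log. The log format, the user and app names, and the roster are all hypothetical and chosen for illustration; this is not a description of how any particular product collects or stores usage data.

```python
from collections import Counter

# Hypothetical usage log: one (user, app) pair per recorded GenAI interaction.
events = [
    ("alice", "chat_assistant"),
    ("alice", "chat_assistant"),
    ("alice", "code_assistant"),
    ("bob", "chat_assistant"),
]

# Hypothetical roster of everyone who has access to the apps.
roster = {"alice", "bob", "carol"}


def uses_per_user_app(events):
    """Count recorded interactions for each (user, app) pair."""
    return Counter(events)


def non_users(events, roster):
    """Return roster members with no recorded interaction at all."""
    active = {user for user, _app in events}
    return sorted(roster - active)


counts = uses_per_user_app(events)
inactive = non_users(events, roster)
print(counts[("alice", "chat_assistant")])
print(inactive)
```

The counts answer "how often"; the list of non-users only shows who is missing. Why they are missing is the observable part that still needs a survey or similar means.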

In theory, there is an infinite number of use cases. In practice, skill levels and the distribution of generic tasks can be measured through prompt analytics. Whether the data component adds to effectiveness depends on policy, and on sampling what employees have actually uploaded into AI apps.

The economic literature on general purpose technologies — from dynamos to the internet to computing — consistently shows the same sequence: adoption comes first, workflow restructuring second, measurable impact third. Visibility into the levers helps drive actions that influence the outcome: the effective use of GenAI apps.

What comes next

In the next post in this series, we will look into the effectiveness data from NROC Security deployments. We will also show a thought experiment for how to aggregate this into an organizational metric for tracking progress over time.

This is Part 2 of the Measuring What Matters blog series.  Read Part 1 →

