𝗔𝗪𝗦 𝗷𝘂𝘀𝘁 𝗿𝗼𝗹𝗹𝗲𝗱 𝗼𝘂𝘁 𝘁𝗵𝗿𝗲𝗲 𝗽𝗿𝗶𝗰𝗶𝗻𝗴 𝘁𝗶𝗲𝗿𝘀 𝗳𝗼𝗿 𝗚𝗲𝗻𝗔𝗜 𝗶𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗼𝗻 𝗕𝗲𝗱𝗿𝗼𝗰𝗸: 𝗺𝗼𝗿𝗲 𝗳𝗹𝗲𝘅𝗶𝗯𝗶𝗹𝗶𝘁𝘆, 𝗯𝘂𝘁 𝗮𝗹𝘀𝗼 𝗺𝗼𝗿𝗲 𝗼𝗽𝗮𝗰𝗶𝘁𝘆

AWS Bedrock adds three inference tiers: Priority, Standard, and Flex. Priority offers lower latency at roughly 60–90% above Standard pricing. Standard is the predictable baseline. Flex costs roughly 50% of Standard but is slower. Anthropic models remain Standard only. Choose by latency versus cost, and benchmark and classify workloads to optimise with discipline.

This is Jean's first guest post, and the first guest post on this site.

AWS Bedrock inference is now offered in three tiers: Priority, Standard, and Flex. The idea is simple, but the pricing clues are scattered across several pages, so I had to reverse-engineer the real differences. Here is the clear version.

𝗣𝗿𝗶𝗼𝗿𝗶𝘁𝘆: higher performance. Lower latency. Noticeably higher cost. Priority pricing is usually 60 to 90% above Standard.

𝗦𝘁𝗮𝗻𝗱𝗮𝗿𝗱: the baseline tier with predictable cost and predictable performance, comparable to today's on-demand pricing and performance.

𝗙𝗹𝗲𝘅: slowest but cheapest. Flex pricing is roughly 50% of Standard.

These percentages are not published by AWS. They come from comparing per-token prices for the models that currently support the new tiers. Today this includes 𝘖𝘱𝘦𝘯𝘈𝘐 𝘖𝘚𝘚 𝘮𝘰𝘥𝘦𝘭𝘴, 𝘘𝘸𝘦𝘯, 𝘋𝘦𝘦𝘱𝘚𝘦𝘦𝘬, 𝘢𝘯𝘥 𝘈𝘮𝘢𝘻𝘰𝘯 𝘕𝘰𝘷𝘢 𝘗𝘳𝘰 𝘢𝘯𝘥 𝘗𝘳𝘦𝘮𝘪𝘦𝘳. 𝘛𝘩𝘦𝘴𝘦 𝘵𝘪𝘦𝘳𝘴 𝘥𝘰 𝘯𝘰𝘵 𝘢𝘱𝘱𝘭𝘺 𝘵𝘰 𝘈𝘯𝘵𝘩𝘳𝘰𝘱𝘪𝘤 𝘮𝘰𝘥𝘦𝘭𝘴, 𝘸𝘩𝘪𝘤𝘩 𝘳𝘦𝘮𝘢𝘪𝘯 𝘪𝘯 𝘚𝘵𝘢𝘯𝘥𝘢𝘳𝘥 𝘰𝘯𝘭𝘺.
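To make the ratios concrete, here is a minimal Python sketch that turns a Standard-tier bill into an estimated range for the other tiers. The multipliers are the estimates quoted above (60 to 90% above Standard for Priority, about 50% of Standard for Flex), not figures published by AWS, and the function name and the 1,000 example spend are hypothetical.

```python
# Relative price multipliers versus Standard, as (low, high) bounds.
# ASSUMPTION: these are the article's estimates, not published AWS numbers.
TIER_MULTIPLIERS = {
    "priority": (1.60, 1.90),  # roughly 60 to 90% above Standard
    "standard": (1.00, 1.00),  # baseline
    "flex": (0.50, 0.50),      # roughly half of Standard
}


def estimate_cost_range(standard_cost: float, tier: str) -> tuple[float, float]:
    """Return the (low, high) estimated cost of a workload on `tier`,
    given what the same workload costs on Standard."""
    low, high = TIER_MULTIPLIERS[tier]
    return (standard_cost * low, standard_cost * high)


if __name__ == "__main__":
    monthly_standard_spend = 1000.0  # hypothetical figure, in any currency
    for tier in TIER_MULTIPLIERS:
        low, high = estimate_cost_range(monthly_standard_spend, tier)
        print(f"{tier:>8}: {low:,.0f} to {high:,.0f}")
```

The real ratio varies by model, so treat the output as a first-pass budget range and check it against the per-token prices for the model you actually use.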

𝗛𝗼𝘄 𝗱𝗼 𝘆𝗼𝘂 𝗰𝗵𝗼𝗼𝘀𝗲 𝘁𝗵𝗲 𝗿𝗶𝗴𝗵𝘁 𝘁𝗶𝗲𝗿?

• Priority when latency matters and the user is waiting

• Standard when you need stable, predictable performance

• Flex when speed is irrelevant and cost efficiency is the objective
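The three rules above can be sketched as a small classifier. This is only an illustration of the decision logic: the `Workload` fields and the `choose_tier` function are hypothetical names, not part of any AWS SDK, and a real classification would likely need more inputs than two booleans.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload, for illustration only."""
    name: str
    user_is_waiting: bool    # a person is blocked on the response
    speed_irrelevant: bool   # e.g. overnight batch jobs, offline evaluation


def choose_tier(workload: Workload) -> str:
    """Map a workload to a tier using the article's three rules."""
    if workload.user_is_waiting:
        return "priority"   # latency matters and the user is waiting
    if workload.speed_irrelevant:
        return "flex"       # cost efficiency is the objective
    return "standard"       # stable, predictable performance


if __name__ == "__main__":
    examples = [
        Workload("support chat", user_is_waiting=True, speed_irrelevant=False),
        Workload("nightly summarisation", user_is_waiting=False, speed_irrelevant=True),
        Workload("internal API", user_is_waiting=False, speed_irrelevant=False),
    ]
    for w in examples:
        print(f"{w.name}: {choose_tier(w)}")
```

Checking the latency rule first means a workload flagged as both user-facing and speed-insensitive is treated as user-facing, which is the safer default.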

𝗦𝘂𝗺𝗺𝗮𝗿𝘆 𝗳𝗼𝗿 𝗽𝗿𝗮𝗰𝘁𝗶𝘁𝗶𝗼𝗻𝗲𝗿𝘀

Flexibility increased, but clarity did not. AWS gives the knobs, but not the numbers. 𝗜𝘁 𝗶𝘀 𝗮𝗻 𝗼𝗽𝗽𝗼𝗿𝘁𝘂𝗻𝗶𝘁𝘆 𝗳𝗼𝗿 𝗼𝗽𝘁𝗶𝗺𝗶𝘀𝗮𝘁𝗶𝗼𝗻, provided that you are ready to benchmark cost and latency and classify workloads with more discipline.
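As a starting point for that benchmarking, here is a minimal sketch that summarises latency samples you have collected yourself for each tier. It assumes you already have per-request latencies in milliseconds. The function names are hypothetical, and nothing here calls Bedrock.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarise_latency(samples_ms: list[float]) -> dict[str, float]:
    """Return the median (p50) and tail (p95) latency of one tier's samples."""
    return {"p50_ms": percentile(samples_ms, 50), "p95_ms": percentile(samples_ms, 95)}


if __name__ == "__main__":
    # Placeholder inputs: replace with latencies measured on your own workload.
    measured = {"standard": [], "flex": [], "priority": []}
    for tier, samples in measured.items():
        if samples:
            print(tier, summarise_latency(samples))
        else:
            print(tier, "no samples collected yet")
```

Comparing p95 rather than the average matters here, because the tiers are likely to differ most in tail latency, which is what a waiting user notices.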
