Foundations of Algorithmic Decision Making: The Five Regions of Decision Complexity
We make decisions all the time, but not all decisions are alike. Consider the following kinds of situations:
- Reasoning: weighing uncertainty and objectives at a single point in time.
- Sequential: making a sequence of decisions while observing outcomes as we proceed.
- Learning: acting in an environment whose model is unknown and must be learned through interaction.
- Imperfect information: acting when the full state of the environment is not observable.
- Multi-agent: making decisions in environments shared with other agents.
Before diving into examples, let’s take a bird’s‑eye view of the landscape. Think of these five regions as different colors on a canvas. When faced with a problem, we can look toward the region whose “color” best matches the nature of the uncertainty we’re dealing with.
Region 1 — Single‑step uncertainty
Here, uncertainty is represented using probability distributions. The key questions are:
- How do we construct these models?
- How do we use them to make inferences?
- How do we learn their parameters and structure?
This is where utility theory and decision networks come into play.
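To make this concrete, here is a minimal sketch (in Python) of choosing among actions by maximizing expected utility. The actions, outcome probabilities, and utilities below are made-up illustrative numbers, not a model of any real problem.

```python
# Minimal expected-utility decision sketch (illustrative numbers only).
# Each action induces a probability distribution over outcomes; we pick
# the action whose expected utility is highest.

outcome_probs = {
    "carry_umbrella": {"dry": 1.0},               # always stay dry
    "leave_umbrella": {"dry": 0.7, "wet": 0.3},   # assume a 30% chance of rain
}
utility = {"dry": 10.0, "wet": -20.0}             # arbitrary utility values

def expected_utility(action):
    return sum(p * utility[o] for o, p in outcome_probs[action].items())

best_action = max(outcome_probs, key=expected_utility)
print(best_action, {a: expected_utility(a) for a in outcome_probs})
```

The same idea scales up: decision networks combine such probability and utility models so that the best action can be computed by inference rather than by enumerating outcomes by hand.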
Region 2 — Sequential decisions
In this region, we must reason about future sequences of actions and observations. Outcomes are uncertain, and the agent must plan ahead. The formulation depends heavily on:
- What the agent knows about the model
- How observable the environment is
The standard mathematical framework here is the Markov Decision Process (MDP).
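As a toy illustration of the MDP framework, the sketch below runs value iteration, repeatedly applying the Bellman backup to estimate how good each state is and then reading off a policy. The two-state MDP, its transition probabilities, and its rewards are invented purely for illustration.

```python
# Value iteration on a tiny, made-up MDP.
# T[s][a] is a list of (next_state, probability); R[s][a] is the immediate reward.

states = ["low", "high"]
actions = ["wait", "charge"]
T = {
    "low":  {"wait": [("low", 0.9), ("high", 0.1)], "charge": [("high", 1.0)]},
    "high": {"wait": [("high", 0.8), ("low", 0.2)], "charge": [("high", 1.0)]},
}
R = {
    "low":  {"wait": 1.0, "charge": -1.0},
    "high": {"wait": 2.0, "charge": -1.0},
}
gamma = 0.95  # discount factor

V = {s: 0.0 for s in states}
for _ in range(200):  # apply the Bellman backup until (approximately) converged
    V = {
        s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a])
               for a in actions)
        for s in states
    }

# Greedy policy with respect to the converged values.
policy = {
    s: max(actions,
           key=lambda a: R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a]))
    for s in states
}
print(V, policy)
```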
Region 3 — Learning the model
Here, the dynamics and rewards are not known in advance. The agent must:
- Balance exploration (trying new actions) with exploitation (using what it has learned).
- Assign credit to earlier decisions that lead to later rewards.
- Generalize from limited experience.
This is the domain of Reinforcement Learning (RL).
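The sketch below shows tabular Q-learning with epsilon-greedy exploration on a hypothetical two-state environment. The dynamics, rewards, and hyperparameters are assumptions chosen only to demonstrate the exploration/exploitation trade-off and the temporal-difference style of credit assignment.

```python
import random

# Tabular Q-learning on a made-up two-state environment (purely illustrative).

states, actions = [0, 1], [0, 1]

def step(state, action):
    """Hypothetical environment: action 1 tends to reach state 1, which pays more."""
    next_state = 1 if (action == 1 and random.random() < 0.8) else 0
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward

alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = {(s, a): 0.0 for s in states for a in actions}

state = 0
for _ in range(5000):
    # Exploration vs. exploitation: occasionally try a random action.
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(actions, key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    # Temporal-difference update assigns credit back to the (state, action) pair.
    td_target = reward + gamma * max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    state = next_state

print({s: max(actions, key=lambda a: Q[(s, a)]) for s in states})
```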
Region 4 — Imperfect state information
In this region, the agent cannot observe the true state directly. Instead, it receives observations that are probabilistically related to the underlying state. The agent maintains a belief distribution over possible states and updates it as it acts.
A policy maps beliefs to actions, and this framework is captured by the Partially Observable Markov Decision Process (POMDP).
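Here is a minimal belief-update (Bayes filter) sketch for a hidden two-state system: predict with the transition model, then correct with the observation model. The probabilities are invented for illustration, and a full POMDP solver would additionally plan over these beliefs rather than just track them.

```python
# Bayes-filter belief update for a hidden two-state system (illustrative numbers).

states = ["ok", "faulty"]
# P(next_state | state), assuming a fixed "operate" action
transition = {"ok": {"ok": 0.95, "faulty": 0.05},
              "faulty": {"ok": 0.0, "faulty": 1.0}}
# P(observation | state)
observation_model = {"ok": {"normal": 0.9, "alarm": 0.1},
                     "faulty": {"normal": 0.3, "alarm": 0.7}}

def update_belief(belief, observation):
    """Predict with the transition model, then correct with the observation."""
    predicted = {
        s2: sum(belief[s] * transition[s][s2] for s in states) for s2 in states
    }
    unnormalized = {s: observation_model[s][observation] * predicted[s]
                    for s in states}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

belief = {"ok": 0.99, "faulty": 0.01}
for obs in ["normal", "alarm", "alarm"]:
    belief = update_belief(belief, obs)
    print(obs, belief)
```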
Region 5 — Multiple agents
Here, several agents act simultaneously in a shared environment. The uncertainty comes from the state, the actions of other agents, and the stochastic transitions. Relevant models include:
- Markov Games (MG)
- Partially Observable Markov Games (POMG)
- Decentralized POMDPs (Dec‑POMDPs)
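As a very small taste of this region, the sketch below finds mutual best responses (pure Nash equilibria) in a one-shot, two-player matrix game with made-up payoffs. Markov games and their partially observable variants add state and stochastic transitions on top of this kind of strategic interaction.

```python
# A one-shot, two-player matrix game (payoffs are made up) showing that
# each agent's best action depends on what the other agent does.

actions = ["stay", "swerve"]
# payoff[a1][a2] = (reward to player 1, reward to player 2)
payoff = {
    "stay":   {"stay": (-10, -10), "swerve": (2, -1)},
    "swerve": {"stay": (-1, 2),    "swerve": (0, 0)},
}

def best_response_p1(a2):
    return max(actions, key=lambda a1: payoff[a1][a2][0])

def best_response_p2(a1):
    return max(actions, key=lambda a2: payoff[a1][a2][1])

# Check every joint action for mutual best response (a pure Nash equilibrium).
for a1 in actions:
    for a2 in actions:
        if best_response_p1(a2) == a1 and best_response_p2(a1) == a2:
            print("equilibrium:", (a1, a2), "payoffs:", payoff[a1][a2])
```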
As the next step in our journey, we will explore each of these regions with concrete examples. Stay tuned!
#AI #DecisionMaking #Algorithms #ReinforcementLearning #MachineLearning #MDP #POMDP #RL #MultiAgentSystems
