DataWeave tutorial: the core functions every MuleSoft exam tests

By the MulePrep team · Updated June 2026

[Figure: JSON / XML / CSV payload → DataWeave 2.0 script → New output structure]

This DataWeave tutorial is the one page to bookmark before your MuleSoft exam. DataWeave is Mule's transformation language, and on both the MuleSoft Developer I and MuleSoft Developer II exams it is the single skill that shows up in the most questions. If you can read a script, reshape a payload, and reach for the right core function, you have removed a whole category of guessing from the exam. This guide builds DataWeave from the header down, then walks the four functions you will use every single day, in the order they appear below: map, filter, reduce, and groupBy.

Everything below is DataWeave 2.0, the version that ships with Mule 4 and the version the current exams test. Examples are runnable as written.

How a DataWeave script is structured: the header and body

Every DataWeave script has two parts separated by three dashes. Above the dashes is the header: directives that configure the transformation. Below is the body: a single expression that produces the output. There is no return statement, because DataWeave is a functional, expression-oriented language. The body is the result.

%dw 2.0
output application/json
---
{
  greeting: "hello",
  when: now()
}

%dw 2.0 declares the language version. output application/json tells the engine what format to emit; change it to application/xml or application/csv and the same logic serialises differently, provided the body's shape is valid for the target format (XML needs a single root element, so this two-key object would need a wrapper, and CSV expects an array of flat objects). That decoupling of logic from output format is the whole point of DataWeave: you describe the shape once, and the writer handles JSON, XML, CSV, or Java for you. The header can also declare an input directive, named var values, reusable fun functions, and import statements, but the version-and-output pair is the baseline you will see at the top of almost every script.
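
To make the optional directives concrete, here is a minimal sketch of a header that combines import, var, and fun. The names taxRate and withTax and the sample values are illustrative assumptions, not taken from any exam script; capitalize is a function from the standard dw::core::Strings module.

```dataweave
%dw 2.0
// Bring one helper in from a standard library module
import capitalize from dw::core::Strings
output application/json
// Header variable, visible throughout the body (hypothetical value)
var taxRate = 0.2
// Reusable header function (hypothetical helper)
fun withTax(amount: Number): Number = amount * (1 + taxRate)
---
{
  label: capitalize("net price"),
  gross: withTax(100)
}
```

Everything above the dashes only declares things; the body below them is still a single expression, and that expression is the output.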

Transform Message: where DataWeave runs in a Mule flow

In Mule 4, DataWeave lives in the Transform Message component. When you drop a Transform Message into a flow and open it, you are editing exactly the header-plus-body script above. The component reads the current message (its payload, attributes, and variables) and, by default, replaces the payload with whatever your script evaluates to.

Three context objects are always in scope inside the script, and knowing them cold is worth easy exam marks:

  • payload — the body of the current Mule message, already parsed into a DataWeave value you can navigate.
  • attributes — metadata about the message, such as HTTP headers, query parameters, and the method on an inbound request.
  • vars — flow variables you set earlier with a Set Variable component, accessed as vars.myVariable.

A Transform Message question usually hands you an input payload and asks what the output is, or which expression produces a target shape. The trick is to read the output directive first (it decides the serialisation), then trace the body against the payload you were given. Navigation uses dots for objects (payload.customer.id) and map for arrays, which is the next stop.
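
As a rough illustration of the three context objects working together, the sketch below assumes an inbound HTTP request whose JSON body contains a customer object. The x-request-id header, the limit query parameter, and the flow variable myVariable are hypothetical names chosen for the example, and the script needs a live Mule message to run, so it will not evaluate on its own.

```dataweave
%dw 2.0
output application/json
---
{
  // payload: dot navigation into the parsed message body
  customerId: payload.customer.id,
  // attributes: HTTP metadata supplied by the listener
  method: attributes.method,
  requestId: attributes.headers."x-request-id",
  // Query parameters arrive as strings; fall back when the parameter is absent
  pageSize: attributes.queryParams.limit default "20",
  // vars: a flow variable set earlier with Set Variable
  source: vars.myVariable
}
```

Selecting a key that does not exist yields null rather than an error, which is why default is a convenient way to supply fallbacks.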

map: transform every item in an array

map is the workhorse. It takes an array and returns a new array of the same length, where each element has been run through a transformation. It never mutates the input; functional purity is a recurring exam theme.

The lambda receives two arguments: the current item and its index. You can name them, or use the implicit $ (item) and $$ (index).

%dw 2.0
output application/json
var orders = [
  { sku: "A1", qty: 2, price: 10 },
  { sku: "B2", qty: 1, price: 25 }
]
---
orders map (order, index) -> {
  line: index + 1,
  sku: order.sku,
  total: order.qty * order.price
}

This returns a two-element array, each object reshaped with a computed total. A map question on the exam typically tests two things: that the output array length equals the input length, and that you remember the lambda's second argument is the zero-based index, not a count (which is why the script adds 1 to get a line number). If you only need the item, orders map { sku: $.sku } using $ is idiomatic and fine.
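
For comparison, here is the same transformation sketched with the implicit parameters instead of named ones. It reuses the orders data from the example above; nothing else is assumed.

```dataweave
%dw 2.0
output application/json
var orders = [
  { sku: "A1", qty: 2, price: 10 },
  { sku: "B2", qty: 1, price: 25 }
]
---
// $ is the current item, $$ is its zero-based index
orders map {
  line: $$ + 1,
  sku: $.sku,
  total: $.qty * $.price
}
```

The result is identical to the named version: line 1 with sku A1 and total 20, then line 2 with sku B2 and total 25. Named parameters tend to read better once lambdas are nested, because an inner $ shadows the outer one.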

filter: keep only the items that match

filter also takes an array and returns an array, but the lambda returns a boolean: true keeps the item, false drops it. The result is the same items, never reshaped — just fewer of them. Reach for filter when the question is "which elements survive", and reach for map when it is "what does each element become".

%dw 2.0
output application/json
var orders = [
  { sku: "A1", qty: 2 },
  { sku: "B2", qty: 0 },
  { sku: "C3", qty: 5 }
]
---
orders filter ((order) -> order.qty > 0)

That keeps only the two orders with a positive quantity, A1 and C3. A common filter trap pairs it with map: payload filter ($.active) map ($.name) first drops inactive records, then projects the survivors to their names. Read left to right (filter runs before map), and the output is an array of names, not objects.
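
The filter-then-map chain can be sketched as a self-contained script. The customers variable and its records are made-up sample data standing in for payload.

```dataweave
%dw 2.0
output application/json
// Hypothetical sample data standing in for payload
var customers = [
  { name: "Ana", active: true },
  { name: "Bo", active: false },
  { name: "Cy", active: true }
]
---
// filter drops Bo first, then map projects each survivor to its name
customers filter ($.active) map ($.name)
```

This evaluates to ["Ana", "Cy"]: two strings, because map ran on the already-filtered array.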

reduce: collapse an array into a single value

reduce is the function people fear, and it is genuinely the one most likely to separate a pass from a fail. It walks an array and accumulates the items into a single result — a sum, a count, a concatenated string, or even a rebuilt object. The lambda takes two arguments: the current item ($) and the accumulator ($$), which carries the running result between iterations.

%dw 2.0
output application/json
var quantities = [2, 1, 5]
---
quantities reduce ((item, total = 0) -> total + item)

This returns 8. Two details earn marks. First, the accumulator is the second argument, so $$ means the accumulator in reduce but the index in map. Mixing those up is the single most common DataWeave mistake. Second, the = 0 after total is the seed: the accumulator's starting value. Without a seed, DataWeave uses the first array element as the initial accumulator and starts iterating from the second. For a plain sum the answer is the same, but the result changes whenever the accumulator is a different kind of value from the items (a count or a rebuilt object, for example), and an unseeded reduce over an empty array returns null instead of a usable default. When a reduce question gives an empty input, the seed is almost always the point being tested.
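
A short sketch of why the seed matters, using the same quantities array. The field names seededCount and unseededCount are only labels for the comparison.

```dataweave
%dw 2.0
output application/json
var quantities = [2, 1, 5]
---
{
  // Seeded: count starts at 0 and the lambda runs for all three items
  seededCount: quantities reduce ((item, count = 0) -> count + 1),
  // Unseeded: count starts as the first element (2) and the lambda
  // runs only for the remaining two items
  unseededCount: quantities reduce ((item, count) -> count + 1)
}
```

seededCount is 3, the real number of items. unseededCount is 4, because the first element was used as the starting accumulator rather than being counted.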

groupBy: turn a flat array into keyed buckets

groupBy reorganises an array into an object whose keys are computed from each item, and whose values are arrays of the items that produced that key. It is how you go from a flat list to a lookup table.

%dw 2.0
output application/json
var orders = [
  { region: "EU", sku: "A1" },
  { region: "US", sku: "B2" },
  { region: "EU", sku: "C3" }
]
---
orders groupBy ((order) -> order.region)

The result is { "EU": [ ...two orders... ], "US": [ ...one order... ] }. The lambda returns the grouping key; identical keys collect into the same bucket. A groupBy question on the exam will test that the values are always arrays, even when a group has one element, and that the keys are written as strings in the output object. groupBy pairs naturally with mapObject or pluck when you then need to summarise each bucket, for instance counting orders per region.
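
Counting orders per region could look like the sketch below, which feeds the grouped object into mapObject. It reuses the orders data from the example above; the parameter names items and region are arbitrary.

```dataweave
%dw 2.0
output application/json
var orders = [
  { region: "EU", sku: "A1" },
  { region: "US", sku: "B2" },
  { region: "EU", sku: "C3" }
]
---
orders
  groupBy ((order) -> order.region)
  // mapObject passes the value first and the key second
  mapObject ((items, region) -> { (region): sizeOf(items) })
```

The output is { "EU": 2, "US": 1 }. The parentheses around region make it a dynamic key; without them the key would be the literal text region.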

How DataWeave is tested, and how to drill it

The functions above compose: a realistic transformation is often filter to drop noise, map to reshape, then reduce or groupBy to summarise. Most exam scenarios are exactly that pipeline applied to an order, a customer list, or an API response. If you can trace one of these scripts and predict its output without running it, you are operating at the level both developer exams demand.
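
One way to picture that pipeline is the sketch below, which totals the value of every order that has stock. The data is invented for the example.

```dataweave
%dw 2.0
output application/json
// Hypothetical sample orders
var orders = [
  { sku: "A1", qty: 2, price: 10 },
  { sku: "B2", qty: 0, price: 25 },
  { sku: "C3", qty: 5, price: 4 }
]
---
orders
  // 1. drop noise: B2 has no quantity
  filter ((order) -> order.qty > 0)
  // 2. reshape: each surviving order becomes its line total
  map ((order) -> order.qty * order.price)
  // 3. summarise: seeded sum of the line totals
  reduce ((lineTotal, sum = 0) -> sum + lineTotal)
```

Tracing it by hand: filter leaves A1 and C3, map produces [20, 20], and reduce returns 40.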

DataWeave depth is also what most cleanly separates Developer I from Developer II — Level 2 leans harder on scripts that combine these functions under realistic payloads, a jump we break down in MCD Level 1 to Level 2. The fastest way to internalise all of this is timed practice against payloads you have not seen. Our free 10-question demo includes DataWeave reading questions in exactly this style, so you can check whether reduce's accumulator and map's index have actually stuck.

Frequently asked questions

What is DataWeave used for in MuleSoft?
DataWeave is Mule 4's expression and transformation language. It reads a message's payload, attributes, and variables and produces a new output - reshaping JSON, XML, CSV, or Java without you writing format-specific parsing code. It runs inside the Transform Message component and is the single most-tested skill on both MuleSoft developer exams.

What is the difference between map, filter, and reduce in DataWeave?
All three take an array. map returns a new array of the same length with each item transformed. filter returns the same items but fewer of them, keeping only those whose lambda returns true. reduce collapses the whole array into a single value such as a sum or a rebuilt object using an accumulator.

Why does the reduce accumulator order trip people up?
In map the lambda's second argument is the index ($$), but in reduce the second argument is the accumulator ($$) and the first is the item. Reversing them is the most common DataWeave mistake. Always provide a seed, like (item, total = 0), so the accumulator has a starting value and the script still works on an empty array.

What does groupBy return in DataWeave?
groupBy turns a flat array into an object keyed by a value computed from each item. Every key maps to an array of the items that produced it - even when a group has a single element, the value is still an array. Object keys are coerced to strings. It is how you build a lookup table from a list before summarising each bucket.

How much DataWeave do I need for the MuleSoft Developer exams?
Enough to read a script and predict its output without running it. Both Developer I and Developer II expect you to navigate payload, attributes, and vars and to apply map, filter, reduce, and groupBy. Developer II leans harder on scripts that combine these functions under realistic payloads, so practice tracing pipelines rather than memorising single functions.

Independent study resource - not affiliated with, endorsed by, or connected to MuleSoft or Salesforce; their trademarks belong to their owners. All practice questions are original.