
How to Code MaxDiff Surveys in Decipher

Complete guide to MaxDiff survey programming in Decipher — balanced item assignment, loop XML structure, Python exec row filtering, and common pitfalls to avoid.

David Thor · June 10, 2026 · 16 min read

MaxDiff (best-worst scaling) is one of the more technically demanding question types in Decipher survey programming. The platform has no native MaxDiff widget — programmers have to build the behavior from scratch using XML structure, Python <exec> blocks, and some careful attribute choices. This post walks through the full implementation using the Indices Method — a pattern that produces balanced item assignment, per-set row filtering, and clean data output ready for hierarchical Bayes (HB) analysis.

By the end you will have working XML for an 8-item, 6-set MaxDiff with balanced item rotation, per-set row filtering, and correct variable naming for downstream analysis.

Decipher MaxDiff Architecture: How the Indices Method Works

The Indices Method is the standard pattern for assembling MaxDiff behavior from three primitives: a loop, a radio question, and Python exec blocks. The name refers to how item selections are recorded by their index in the full item list rather than by their position within a given set.

The implementation here uses three pieces:

  1. A hidden <text> variable stores the per-respondent assignment as a serialized list of lists (written with Python's repr) recording which items appear in each set, generated once per respondent at survey start.
  2. A <loop> iterates once per set, with each iteration containing a <radio adim="cols"> question.
  3. A Python <exec> block inside the loop reads the assignment for the current set and disables rows that are not in it, then reorders the remaining rows.

All rows are always declared in the XML. The Python exec is what makes only 4 of 8 visible per set. This approach differs from the pattern in the official Forsta documentation, which relies on an external design file — we cover that divergence at the end.

Survey Design Parameters

Before writing a line of XML, nail down these four numbers:

Parameter   | What it controls              | Typical value
Total items | Size of the item pool         | 6–30
pickN       | Items shown per set           | 4–6
rounds      | Number of sets per respondent | 6–8
Strategy    | Rotation method               | "balanced"

A balanced design aims for each item to appear roughly the same number of times across all sets. With 8 items, 4 per set, and 6 sets, each item appears in 3 of the 6 sets — the minimum for stable individual-level estimates in hierarchical Bayes modeling.

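Avoid setting pickN much above roughly 40% of your total pool. Note that the 4-of-8 example used in this post is already at 50%, so treat it as the upper end for a small pool. At 5 items per set out of 8 total (62.5%), respondents see the same items repeatedly and the discrimination between items weakens.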
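The arithmetic behind these figures is easy to check before committing to a design. Here is a minimal sketch in plain Python, meant to be run outside Decipher. The name design_summary is a hypothetical helper for illustration, not part of the survey code.

```python
def design_summary(n_items, pick_n, rounds):
    """Return basic exposure figures for one respondent's MaxDiff design."""
    slots = pick_n * rounds  # total item slots shown across all sets
    return {
        # Average number of sets in which each item appears
        "exposures_per_item": slots / float(n_items),
        # Fraction of the pool visible in any single set
        "share_per_set": pick_n / float(n_items),
    }

summary = design_summary(8, 4, 6)
print(summary)
```

For the 8-item, 4-per-set, 6-set design this gives 3.0 exposures per item and a per-set share of 0.5.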

Part 1: Helper Functions in <exec when="init">

The <exec when="init"> block runs once when the survey initializes. This is where you define the Python functions that handle assignment and row filtering. Paste it near the top of your survey, before any questions.

<exec when="init">
def q_json(value):
    return repr(value)
 
def q_parse_json(raw):
    if raw is None or raw == "":
        return []
    try:
        return eval(raw, {"__builtins__": None}, {})
    except Exception:
        return []
 
def q_assignment_slot(raw, slot):
    assigned = q_parse_json(raw)
    if slot < 0 or slot >= len(assigned):
        return []
    return assigned[slot]
 
def q_shuffle(items):
    out = list(items)
    shuffle(out)
    return out
 
def q_assign_balanced(pool, pick_n, rounds):
    # Accept either {"value": ...} dicts or plain values in the pool
    values = [item["value"] if isinstance(item, dict) else item for item in pool]
    deck = []
    assigned = []
    for round_index in range(rounds):
        if len(deck) < pick_n:
            remaining = list(deck)
            refill = list(values)
            shuffle(refill)
            safe = [x for x in refill if x not in remaining]
            deck.extend(safe + [x for x in refill if x in remaining])
        assigned.append(deck[:pick_n])
        deck = deck[pick_n:]
    return assigned
 
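def q_sample_assign(pool, pick_n, rounds, strategy):
    if strategy in ("balanced", "sparse", "rotational"):
        return q_assign_balanced(pool, pick_n, rounds)
    # Fallback: independent random draw per set. Extract values first so the
    # result has the same shape as the balanced path (lists of item values).
    values = [item["value"] if isinstance(item, dict) else item for item in pool]
    return [q_shuffle(values)[:pick_n] for round_index in range(rounds)]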
 
def q_setup_rows_by_labels(question, labels):
    wanted = [str(label) for label in labels]
    row_index = dict((str(r.o.label), r.index) for r in question.rows)
    for r in question.rows:
        r.disabled = str(r.o.label) not in wanted
    question.rows.order = [row_index[label] for label in wanted if label in row_index]
 
def q_setup_rows_by_value_map(question, values, value_to_row):
    labels = []
    for value in values:
        key = str(value)
        if key in value_to_row:
            labels.append(value_to_row[key])
    q_setup_rows_by_labels(question, labels)
</exec>

How the balanced assignment algorithm works

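q_assign_balanced works like a card deck. It shuffles all item values and deals pick_n cards per round. When fewer than pick_n cards are left, it appends a fresh shuffle of the full pool, ordered so that items not already among the leftover cards come first (the safe/remaining split). That keeps the set that straddles the refill free of duplicates. Every full pass through the deck shows each item once, so across 6 sets of 4 from a pool of 8, each item appears exactly 3 times. Because each pass is reshuffled, item pairings vary from pass to pass and from respondent to respondent. The algorithm does not explicitly balance pairwise co-occurrence, but varied pairings across the sample are what MaxDiff analysis needs.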
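One way to convince yourself of the balance property is to run the dealing logic outside Decipher and count appearances. The sketch below is a standalone copy of the same deck logic. It assumes a regular Python interpreter and uses random.shuffle in place of the shuffle available inside the survey. The name assign_balanced and the rng parameter are illustrative only.

```python
import random
from collections import Counter

def assign_balanced(values, pick_n, rounds, rng=random):
    """Standalone copy of the deck-dealing logic, for testing outside Decipher."""
    deck, assigned = [], []
    for _ in range(rounds):
        if len(deck) < pick_n:
            remaining = list(deck)
            refill = list(values)
            rng.shuffle(refill)
            # Items not left over from the previous pass go first, so the set
            # that straddles the refill cannot contain the same item twice.
            safe = [x for x in refill if x not in remaining]
            dupes = [x for x in refill if x in remaining]
            deck.extend(safe + dupes)
        assigned.append(deck[:pick_n])
        deck = deck[pick_n:]
    return assigned

# 8 items, 4 per set, 6 sets: the design used in this post
sets_8 = assign_balanced(range(1, 9), 4, 6)
counts_8 = Counter(item for s in sets_8 for item in s)

# 10 items, 4 per set, 5 sets: a case where a set straddles a refill
sets_10 = assign_balanced(range(1, 11), 4, 5)
counts_10 = Counter(item for s in sets_10 for item in s)

print(sets_8)
print(sorted(counts_8.items()))
```

In the 8-item case every item should come out with a count of exactly 3, since each pass through the deck fills two sets. The 10-item case exercises the safe/remaining split.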

q_setup_rows_by_value_map handles the per-set filtering. It maps the current set's assigned item values to row labels, then calls q_setup_rows_by_labels, which sets r.disabled = True on every out-of-set row and reorders question.rows.order to match the assigned sequence.
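The filtering step can also be traced outside the platform with stand-in objects. In the sketch below, FakeRow, FakeRows, and FakeQuestion are hypothetical mocks that only imitate the attributes the helpers touch (r.o.label, r.index, r.disabled, and rows.order). They are not Decipher classes, and real question objects will behave differently in other respects.

```python
class FakeRowSpec(object):
    def __init__(self, label):
        self.label = label

class FakeRow(object):
    """Hypothetical stand-in for a Decipher row object."""
    def __init__(self, index, label):
        self.index = index
        self.o = FakeRowSpec(label)
        self.disabled = False

class FakeRows(list):
    """List subclass so the collection can carry an .order attribute."""
    order = None

class FakeQuestion(object):
    def __init__(self, labels):
        self.rows = FakeRows(FakeRow(i, lab) for i, lab in enumerate(labels))

def setup_rows_by_labels(question, labels):
    wanted = [str(label) for label in labels]
    row_index = dict((str(r.o.label), r.index) for r in question.rows)
    for r in question.rows:
        r.disabled = str(r.o.label) not in wanted   # hide out-of-set rows
    question.rows.order = [row_index[lab] for lab in wanted if lab in row_index]

def setup_rows_by_value_map(question, values, value_to_row):
    labels = [value_to_row[str(v)] for v in values if str(v) in value_to_row]
    setup_rows_by_labels(question, labels)

question = FakeQuestion(["item%d" % i for i in range(1, 9)])
value_map = dict((str(i), "item%d" % i) for i in range(1, 9))
setup_rows_by_value_map(question, [3, 7, 1, 5], value_map)

visible = [r.o.label for r in question.rows if not r.disabled]
print(visible)              # rows left enabled, in declaration order
print(question.rows.order)  # row indices in assigned order
```

For the assigned set [3, 7, 1, 5], the enabled rows are item1, item3, item5, and item7, and the order list holds their zero-based indices in assigned sequence: [2, 6, 0, 4].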

Part 2: Hidden Variable and Set Generation

Right after your helper functions, declare the hidden variable and the exec that populates it.

<!-- Hidden variable stores the per-respondent item assignment -->
<text label="claimSets" optional="1" where="execute">
  <title>HIDDEN: claimSets</title>
</text>
 
<!-- Generate balanced sets once when the respondent enters the survey -->
<exec>
pool = [
  {"value": 1, "weight": 1}, {"value": 2, "weight": 1},
  {"value": 3, "weight": 1}, {"value": 4, "weight": 1},
  {"value": 5, "weight": 1}, {"value": 6, "weight": 1},
  {"value": 7, "weight": 1}, {"value": 8, "weight": 1}
]
claimSets.val = q_json(q_sample_assign(pool, 4, 6, "balanced"))
</exec>
 
<suspend/>

where="execute" keeps the variable out of the rendered survey but saves it to the data file. Do not skip this variable. Without claimSets in your data, you will not know which items were shown in which set for any given respondent, making post-fieldwork MaxDiff analysis impossible.

The <suspend/> after the exec commits the assignment before the respondent advances into the loop.
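To see what actually lands in claimSets, the storage helpers can be exercised on their own. The sketch below copies q_json, q_parse_json, and q_assignment_slot from Part 1 and runs them in a regular Python interpreter. The sample assignment is made up for illustration.

```python
def q_json(value):
    return repr(value)  # stored form is a Python literal string

def q_parse_json(raw):
    if raw is None or raw == "":
        return []
    try:
        return eval(raw, {"__builtins__": None}, {})
    except Exception:
        return []

def q_assignment_slot(raw, slot):
    assigned = q_parse_json(raw)
    if slot < 0 or slot >= len(assigned):
        return []
    return assigned[slot]

# Made-up two-set assignment, as q_sample_assign might return it
stored = q_json([[3, 7, 1, 5], [2, 8, 4, 6]])
print(stored)

first_set = q_assignment_slot(stored, 0)   # set 1 (loop task 1 maps to slot 0)
missing = q_assignment_slot(stored, 5)     # out-of-range slot falls back to []
empty = q_assignment_slot("", 0)           # unset variable falls back to []
```

The loop's exec calls q_assignment_slot with the task number minus one, which is why task 1 reads slot 0.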

Part 3: The MaxDiff Loop

This is the full loop structure — the block with the radio question, all item rows, CSS overrides for the grid layout, and the looprows that drive iteration.

<loop label="Q1_md_loop" randomizeChildren="0" vars="task">
  <title>Q1 - MaxDiff Loop</title>
 
  <block label="Q1_md_block" randomize="1">
    <radio label="Q1_[loopvar: task]"
           adim="cols"
           grouping="cols"
           shuffle="rows"
           ss:questionClassNames="Q1_maxdiff"
           unique="1">
 
      <title>Which message is most/least motivating?</title>
 
      <!-- Activate only the items assigned to this set -->
      <exec>
Q1_values = q_assignment_slot(claimSets.val, int([loopvar: task]) - 1)
q_setup_rows_by_value_map(
    Q1_[loopvar: task],
    Q1_values,
    {"1": "item1", "2": "item2", "3": "item3", "4": "item4",
     "5": "item5", "6": "item6", "7": "item7", "8": "item8"}
)
      </exec>
 
      <col label="best">Most Motivating</col>
      <col label="worst">Least Motivating</col>
 
      <!-- All 8 items declared; the exec above disables non-assigned ones -->
      <row label="item1">Claim 1</row>
      <row label="item2">Claim 2</row>
      <row label="item3">Claim 3</row>
      <row label="item4">Claim 4</row>
      <row label="item5">Claim 5</row>
      <row label="item6">Claim 6</row>
      <row label="item7">Claim 7</row>
      <row label="item8">Claim 8</row>
 
      <!-- CSS overrides for MaxDiff grid layout -->
      <style mode="before" name="question.header"><![CDATA[
<style type="text/css">
.Q1_maxdiff tr.maxdiff-header-legend {
    background-color: transparent;
    border-bottom: 2px solid #d9d9d9;
}
.Q1_maxdiff tr.maxdiff-header-legend th.legend {
    background-color: transparent;
    border: none;
}
.Q1_maxdiff tr.maxdiff-row td.element {
    border-left: none;
    border-right: none;
    border-top: none;
    border-bottom: 1px solid #d9d9d9;
    text-align: center;
}
.Q1_maxdiff tr.maxdiff-row th.row-legend {
    background-color: transparent;
    border-left: none;
    border-right: none;
    border-top: none;
    border-bottom: 1px solid #d9d9d9;
    text-align: center;
}
</style>
      ]]></style>
 
      <style name="question.top-legend"><![CDATA[
\@if ec.simpleList
    $(legends)
\@else
    <$(tag) class="maxdiff-header-legend row row-col-legends row-col-legends-top ${"mobile-top-row-legend " if mobileOnly else ""}colCount-$(colCount)">
        ${"%s%s" % (legends.split("</th>")[0],"</th>")}
        $(left)
        ${"%s%s" % (legends.split("</th>")[1],"</th>")}
    </$(tag)>
    \@if not simple
  </tbody>
  <tbody>
    \@endif
\@endif
      ]]></style>
 
      <style name="question.row"><![CDATA[
\@if ec.simpleList
    $(elements)
\@else
    <$(tag) class="maxdiff-row row row-elements $(style) colCount-$(colCount)">
        ${"%s%s" % (elements.split("</td>")[0],"</td>")}
        $(left)
        ${"%s%s" % (elements.split("</td>")[1],"</td>")}
    </$(tag)>
\@endif
      ]]></style>
 
    </radio>
  </block>
 
  <!-- One looprow per set -->
  <looprow label="1"><loopvar name="task">1</loopvar></looprow>
  <looprow label="2"><loopvar name="task">2</loopvar></looprow>
  <looprow label="3"><loopvar name="task">3</loopvar></looprow>
  <looprow label="4"><loopvar name="task">4</loopvar></looprow>
  <looprow label="5"><loopvar name="task">5</loopvar></looprow>
  <looprow label="6"><loopvar name="task">6</loopvar></looprow>
</loop>

Key Attributes Explained

adim="cols" and grouping="cols"

These two attributes must appear together. They tell Decipher that the answer dimension is the columns (best / worst), not the rows. Without adim="cols", the question renders as a standard grid in which each row is answered separately and the columns are the response options. With it, Decipher flips the axis: respondents pick one row for each column, that is, one item as best and one item as worst, which is the MaxDiff interaction model you want.

unique="1"

Prevents a respondent from selecting the same item as both best and worst. Without this, nothing in the platform stops that from happening, and you will collect invalid responses that break your MaxDiff model.

shuffle="rows"

Randomizes item order within each set presentation. This is separate from the balanced assignment — the assignment controls which items appear; shuffle="rows" controls the order they appear in. Always include it to reduce primacy and recency position bias.

randomizeChildren="0" on the loop

Do not randomize set order. The q_assign_balanced function already handles even distribution across sets. If you let Decipher shuffle loop iterations, you disrupt the deck-dealing logic and can end up with uneven item exposure. Set this to 0.

randomize="1" on the block

This controls randomization of sibling elements inside the block, not loop iteration order. It is useful when you add an anchor question (such as a per-set importance rating) alongside the MaxDiff radio — the block will shuffle those two elements relative to each other. It has no effect when the block contains only the radio.

[loopvar: task] syntax

Decipher's loop variable interpolation. The radio label Q1_[loopvar: task] resolves to Q1_1 on the first iteration, Q1_2 on the second, and so on. This produces the per-set variable structure in your data: Q1_1_best, Q1_1_worst, Q1_2_best, Q1_2_worst, etc.

The same interpolation is used inside the exec block — Q1_[loopvar: task] in Python is resolved by Decipher before execution, which is how the exec can reference the correct question object for the current set.

Why all rows are declared even when only 4 are shown

Row declarations in a Decipher radio question are fixed at the XML level — you cannot conditionally include or exclude them at runtime. The workaround: declare all rows, and use Python to disable those not in the current set. r.disabled = True hides a row from rendering and excludes it from validation. question.rows.order controls the display sequence of the remaining active rows.

Data Output Structure

After fieldwork, your data file contains:

  • claimSets — the JSON assignment string for each respondent, e.g., [[3, 7, 1, 5], [2, 8, 4, 6], ...]
  • Q1_1_best, Q1_1_worst — item value selected as best/worst in set 1
  • Q1_2_best, Q1_2_worst — item value selected in set 2
  • ... through Q1_6_best, Q1_6_worst

To run a MaxDiff hierarchical Bayes model, you need to reconstruct the choice task for each respondent: which items were shown (from claimSets), which was selected as best, and which as worst. This is why claimSets is not optional — it is the mapping between your response data and your experimental design.
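The reconstruction step might look like the following in offline analysis code. This is a sketch under stated assumptions: each respondent record is a dict keyed by the variable names listed above, and the best/worst values are item values from the pool (1 through 8 here). The function name build_choice_tasks and the sample record are hypothetical, so check the names and codes in your own export before relying on it.

```python
import ast

def build_choice_tasks(record, rounds=6, question="Q1"):
    """Pair each set's shown items with the best and worst picks."""
    # claimSets holds a Python-literal list of lists, one inner list per set
    shown_sets = ast.literal_eval(record["claimSets"])
    tasks = []
    for task in range(1, rounds + 1):
        shown = shown_sets[task - 1]
        best = int(record["%s_%d_best" % (question, task)])
        worst = int(record["%s_%d_worst" % (question, task)])
        # A pick must be one of the shown items, and best must differ from worst
        if best not in shown or worst not in shown or best == worst:
            raise ValueError("inconsistent response in set %d" % task)
        tasks.append({"task": task, "shown": shown, "best": best, "worst": worst})
    return tasks

# Made-up record with two sets, for illustration only
record = {
    "claimSets": "[[3, 7, 1, 5], [2, 8, 4, 6]]",
    "Q1_1_best": 7, "Q1_1_worst": 1,
    "Q1_2_best": "2", "Q1_2_worst": "6",
}
tasks = build_choice_tasks(record, rounds=2)
print(tasks)
```

Raising on an inconsistent set is a design choice for the sketch. In production you may prefer to log and drop the respondent instead.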

Design Recommendations

Items per set. 4–6 is the standard range. Below 4, the best-worst tradeoff loses discriminative power. Above 6, cognitive load climbs and satisficing increases.

Number of sets. 6–8 sets gives stable individual-level estimates for HB modeling. Fewer sets are acceptable for aggregate-level analysis only.

Item appearance frequency. With 8 items, 4 per set, and 6 sets, each item appears in 3 sets on average. That is a practical minimum for HB. For larger pools (15+ items), increase rounds until each item appears at least 3 times.
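The rounds needed for a given exposure target follow directly from the slot count: rounds times pickN must be at least the target exposures times the pool size. A small sketch, with min_rounds as a hypothetical helper name and 3 exposures as the default taken from the guideline above:

```python
def min_rounds(n_items, pick_n, min_exposures=3):
    """Smallest number of sets so each item can appear min_exposures times."""
    needed_slots = min_exposures * n_items
    # Integer ceiling of needed_slots / pick_n
    return (needed_slots + pick_n - 1) // pick_n

print(min_rounds(8, 4))    # the design used in this post
print(min_rounds(15, 5))   # a larger pool
```

The 8-item, 4-per-set design needs 6 sets, and a 15-item pool at 5 per set needs 9.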

Anchor questions. If you need a per-set importance or willingness-to-pay rating alongside the MaxDiff task, add a <number> or <float> element inside the <block> alongside the radio. Set randomize="1" on the block to control presentation order. Anchor data saves as Q1_md_block_1_youranchorlabel, Q1_md_block_2_youranchorlabel, etc.

Common Decipher MaxDiff Pitfalls

Missing adim="cols". The question renders as a standard grid instead of the two-column best/worst layout. Check this first if your preview looks wrong.

Missing unique="1". Respondents can select the same item as both best and worst. This produces logically invalid responses that analysis software will error on or silently drop.

randomizeChildren="1" on the loop. Set order gets shuffled, which can create uneven item exposure depending on how early terminations interact with the deck-dealing algorithm. Always set this to 0.

Not saving claimSets. If you omit the hidden variable or set where incorrectly, the assignment does not appear in your data export. You will have response values but no way to reconstruct which items were shown. There is no recovery from this after fieldwork closes.

pickN too high relative to pool size. At 6 items per set out of 8 total, respondents see 75% of items each round. Repetition weakens discrimination and the balanced sampling algorithm has very little room to spread items evenly. Keep pickN at or below 40% of your pool.

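Using eval() in q_parse_json. The function uses eval with __builtins__: None intentionally. q_json serializes the assignment with repr, so the stored string is a Python literal rather than guaranteed-valid JSON, and json.loads is not a safe assumption. Removing the builtins narrows what a tampered stored value could do, but it is not a complete sandbox. The safeguard that matters is that claimSets is only ever written by the survey's own exec, never from respondent input.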
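When you parse claimSets offline, in an environment you control, the standard library's ast.literal_eval reads the same repr string without evaluating arbitrary expressions. A minimal sketch follows. parse_assignment is a hypothetical name, and whether ast can be imported inside a Decipher exec is an assumption you would need to verify on your own server.

```python
import ast

def parse_assignment(raw):
    """Parse a stored assignment string, returning [] for anything unusable."""
    if not raw:
        return []
    try:
        value = ast.literal_eval(raw)   # accepts literals only, no calls or names
    except (ValueError, SyntaxError):
        return []
    # Only a list is a valid assignment structure
    return value if isinstance(value, list) else []

good = parse_assignment("[[3, 7, 1, 5], [2, 8, 4, 6]]")
rejected = parse_assignment("__import__('os').getcwd()")
print(good)
print(rejected)
```

The second call returns an empty list because literal_eval refuses function calls rather than executing them.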

Adapting This Template

To adapt this code to your study, change five things:

  1. Pool values. Replace the {"value": N, "weight": 1} entries with your actual item count.
  2. pickN and rounds. Change the 4 and 6 arguments to q_sample_assign to match your design.
  3. Row labels and text. Replace the item1 through item8 labels and the Claim 1 through Claim 8 text with your actual items.
  4. The value-to-label map in the exec. The dict {"1": "item1", "2": "item2", ...} must map your pool values to your row labels exactly.
  5. Column labels. Replace Most Motivating / Least Motivating with appropriate best/worst language for your study.

The helper functions in Part 1 do not need to change between studies.
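Since the pool, the value-to-label map, and the row declarations all have to agree, it can help to generate the three from one item list instead of editing each by hand. The sketch below is an offline helper under the naming used in this post (values 1 to N, labels item1 to itemN). The function names are hypothetical.

```python
from xml.sax.saxutils import escape

def build_pool(n_items):
    """Pool entries in the {"value": N, "weight": 1} shape used in Part 2."""
    return [{"value": i, "weight": 1} for i in range(1, n_items + 1)]

def build_value_map(n_items, prefix="item"):
    """Map pool values (as strings) to row labels, e.g. "1" -> "item1"."""
    return dict((str(i), "%s%d" % (prefix, i)) for i in range(1, n_items + 1))

def build_row_xml(texts, prefix="item"):
    """Emit one <row> per item text, escaping XML special characters."""
    return "\n".join(
        '<row label="%s%d">%s</row>' % (prefix, i, escape(text))
        for i, text in enumerate(texts, 1)
    )

item_texts = ["Fast & easy", "Low cost"]   # made-up item wording
pool = build_pool(len(item_texts))
value_map = build_value_map(len(item_texts))
rows_xml = build_row_xml(item_texts)
print(rows_xml)
```

Paste the generated rows into the radio question and the generated dict into the loop's exec, so the two cannot drift apart.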

If you want to go deeper on MaxDiff experimental design — specifically why survey software uses runtime balanced assignment instead of a classic BIBD — see The MaxDiff Design Gap.

How This Differs from the Official Forsta Docs

Forsta's own documentation for the Indices Method (see Creating a MaxDiff Question — Indices Method) uses a different setup worth understanding before you decide which approach fits your project.

The key difference is a separate design.dat file — a tab-delimited file defining the exact item sets per version and task, generated offline using a tool like Sawtooth Software and uploaded to the project directory. The XML template opens this file at init via setupMaxDiffFile("design.dat"), builds a Python dict keyed by version and task ("v1_t1", "v1_t2", etc.), and uses a <quota> element plus p.markers to assign each respondent to one of the pre-computed versions. The row-disabling function (setupMaxDiffItemsI) then looks up that respondent's version-task key to get their item list.
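To make the version-task lookup concrete, here is a rough illustration of turning a tab-delimited design into a dict keyed like "v1_t1". The column layout (version, task, then item indices, with an optional header row) is an assumption made for this sketch, and load_design is a hypothetical function. It is not Forsta's setupMaxDiffFile and does not reproduce its behavior.

```python
import csv
import io

def load_design(handle):
    """Read an assumed version/task/items layout into a version-task dict."""
    design = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and any header row (first cell is not a number)
        if not row or not row[0].strip().isdigit():
            continue
        version, task = int(row[0]), int(row[1])
        design["v%d_t%d" % (version, task)] = [int(x) for x in row[2:] if x.strip()]
    return design

# Made-up file content: two tasks for version 1, one task for version 2
sample = (
    "version\ttask\titem1\titem2\titem3\titem4\n"
    "1\t1\t3\t7\t1\t5\n"
    "1\t2\t2\t8\t4\t6\n"
    "2\t1\t6\t2\t8\t3\n"
)
design = load_design(io.StringIO(sample))
print(design["v1_t1"])
```

Check the layout of your actual design file before adapting anything like this.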

The approach here diverges from that pattern deliberately, for two reasons.

All business logic lives in one file. With the official approach, the survey's behavior depends on an external artifact generated outside the project. When something goes wrong — items not showing correctly, a row not disabling — you are debugging across a file boundary, and the design file itself was produced by an external tool and is not readable inline. With the self-contained approach here, everything is in the XML: the pool definition, the assignment algorithm, the row-to-value mapping, and the loop. A programmer can read the file top to bottom and trace the full execution path without switching contexts.

Runtime generation is easier to maintain than a static design file. The official design.dat is a snapshot of a fixed experimental design. If the client adds an item, changes the number of tasks, or reorders the pool, the file has to be regenerated externally and re-uploaded. The q_assign_balanced function here takes pool, pick_n, and rounds as arguments in the XML itself — a design change is a one-line edit, and each respondent automatically gets a fresh balanced assignment.

There is a real tradeoff: the official approach assigns respondents to fixed pre-computed versions, which means you can audit the exact design matrix before fielding and verify pairwise co-occurrence properties offline. The runtime approach produces a unique near-balanced design per respondent, which averages out well across a full sample but cannot be inspected as a single design object. For most commercial MaxDiff work the runtime approach is the right call. If your methodology requires a fixed, auditable design — for example, a study following a strict BIBD or one where all respondents must share identical set compositions — the official design file pattern is more appropriate.
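Although a runtime design cannot be inspected before fielding, pairwise co-occurrence can be audited afterwards from the stored assignments, or beforehand from simulated ones. The sketch below counts how often each pair of items shared a set. pair_counts is a hypothetical helper, and the two sample respondents are made up.

```python
from collections import Counter
from itertools import combinations

def pair_counts(respondent_sets):
    """Count, across respondents, how often each item pair appeared together."""
    counts = Counter()
    for sets in respondent_sets:          # one list of sets per respondent
        for shown in sets:                # one list of item values per set
            for pair in combinations(sorted(shown), 2):
                counts[pair] += 1
    return counts

# Parsed claimSets for two made-up respondents, two sets each
sample = [
    [[3, 7, 1, 5], [2, 8, 4, 6]],
    [[1, 2, 3, 4], [5, 6, 7, 8]],
]
counts = pair_counts(sample)
print(counts[(1, 3)], counts[(1, 2)], counts[(1, 8)])
```

On a real sample, a wide spread between the most and least frequent pairs would be a signal to look more closely at the design.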


Frequently Asked Questions

How many items can a Decipher MaxDiff handle? There is no hard platform limit. In practice, studies with up to 30 items are common. Beyond that, the number of sets required to give each item sufficient exposure makes the survey too long. For 20+ items, plan for at least 8–10 sets and keep pickN at 4–5.

What is the difference between "survey programming" in Decipher and Sawtooth? Decipher uses XML with embedded Python <exec> blocks; Sawtooth uses a proprietary scripting language. The Indices Method described here is the Decipher equivalent of Sawtooth's built-in MaxDiff module — you get the same balanced assignment and per-set filtering, but you have to wire it up yourself rather than configuring a wizard.

Can I use this code in Forsta (the new Decipher)? Yes. Forsta is the company that acquired Decipher, and the survey programming platform is the same. All XML, Python exec syntax, and attribute names covered here work identically in the current Forsta/Decipher platform.

How do I analyze Decipher MaxDiff data after fieldwork? Export claimSets alongside the Q1_N_best and Q1_N_worst variables. The claimSets value per respondent is the JSON assignment array — it tells you which items were shown in each set. Feed that into your HB software (Sawtooth CBC/HB, R's logitr, or a custom model) alongside the best/worst selections to estimate individual-level utilities.


How Questra Handles This Automatically

The XML in this post is exactly what Questra generates when you program a MaxDiff survey. You describe the study in a simple definition — your items, how many to show per set, how many sets — and Questra's survey compiler produces the complete Decipher XML: the <exec when="init"> helper functions, the hidden assignment variable, the balanced sampling exec, the loop structure, the per-set row filtering, and the CSS layout overrides. The same code you just read through is what lands in your survey file.

This is the core of what Questra does for survey programming more broadly: researchers and programmers describe what a survey should do, and the platform handles the translation to platform-specific XML. For Decipher, that means generating the Indices Method pattern for MaxDiff, the correct loop structures for conjoint exercises, proper skip logic, and all the other boilerplate that experienced programmers have memorized but still have to type. The output is auditable, version-controlled XML that you own — not a black box.

If you are programming MaxDiff surveys in Decipher regularly, or working across multiple platforms and need consistent output, learn more about Questra at questra.ai.

About the author

David Thor, Founder & CEO

Has spent 15 years building AI products and tools that make teams more productive — from Confirm.io (acq. by Facebook) to Architect.io. Holds two patents in AI-powered document authentication. Started Questra after watching his wife Emily, a market research consultant, deal with long wait times between survey drafts and revisions just to get studies into field.