Run price experiments to optimize in-app product prices

You can run price experiments in Play Console to diversify your pricing strategy for emerging markets or test different price points in your core markets to drive more revenue.

Overview

Price experiments let you run A/B tests and use the results to adjust your app's prices with confidence according to purchasing power in different markets. You can also use price experiments to test different price points against a control price in your core markets to drive more revenue. Setting optimal prices for your in-app products in each country/region helps your pricing strategy keep growing revenue while still attracting new users. Because this feature is based on in-app product pricing and sales, you must be a merchant developer to use it.

Prepare and run a price experiment

The sections below provide guidance and important information about preparing and running price experiments and understanding their results.

Before you start

Expand and read each of the sections below before you start setting up a price experiment in Play Console.

Prerequisites
Important information

General information

  • Experiments are at the app-level only; you cannot run experiments across multiple apps.
  • An experiment can run for a maximum of six months, after which prices revert to the original price. Prices also revert to the original price 14 days after the experiment reaches statistical significance.
  • You can test a maximum of 1000 in-app products in a single experiment.
  • You cannot run the same experiment with overlapping in-app products and overlapping countries until at least 30 days after the end of the original experiment.
  • You cannot pause experiments. You can only run them or stop them entirely. You cannot restart a stopped experiment.

Country/region limits

  • You can only run one experiment at a time in any given country/region. For example, if you're already running an experiment in Norway, you can't run another experiment there on different products unless you stop the first experiment and create another.
  • The tested price must be within the defined price ranges for in-app purchases. If a price percentage is outside the defined price range for that country/region as part of an experiment, the price percentage will automatically stop at the boundary value.
  • You can have a maximum of two price variants and a control for each experiment.
  • Experiments are only implemented in the selected country/region and not in its associated location overrides. For example, if you run an experiment in France, this excludes associated French territories (French Guiana, French Polynesia, Guadeloupe, Martinique, Mayotte, New Caledonia, Réunion, Saint Barthélemy, Saint Martin, Saint Pierre and Miquelon, Wallis and Futuna).
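
The boundary rule above (a tested price that would fall outside the allowed range stops at the boundary value) can be sketched as follows. This is a minimal illustration, not Play Console's implementation: the function name, the sign convention for `percent_change`, and the minimum and maximum prices in the usage note are all hypothetical, and the sketch ignores exchange rates and country-specific pricing patterns.

```python
from decimal import Decimal


def experiment_price(base_price, percent_change, min_price, max_price):
    """Apply a percentage change, then clamp to the allowed price range.

    percent_change is negative for a decrease and positive for an increase
    (an assumed convention for this sketch).
    """
    base = Decimal(str(base_price))
    candidate = base * (Decimal(100) + Decimal(percent_change)) / Decimal(100)
    lower = Decimal(str(min_price))
    upper = Decimal(str(max_price))
    # A price outside the range stops at the nearest boundary value.
    return max(lower, min(candidate, upper))
```

For example, with a made-up range of 0.50 to 400.00, a 90% decrease on a 1.00 price would be held at 0.50 rather than dropping to 0.10.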

Changes to in-app products

  • You cannot change the price of an in-app product featured in a running experiment. To change the price, you must stop the experiment. Note the following:
    • The only exceptions are when you run two experiments and apply the results of the first experiment. Alternatively, if your in-app product is linked to a pricing template, you can change in-app product prices only for non-experimented countries.
    • If you configure a new in-app product during an active experiment, it'll be omitted from the experiment.

Running experiments and applying results

  • You can apply experiment prices at any time, but we recommend waiting until your results are statistically significant and only applying price changes that perform better than your control group.
  • You cannot add additional countries/regions or products to an experiment which is currently running.
  • If one or more of your products are linked to a pricing template, applying the prices will update your experiment's country/region prices and unlink these products from the template(s).
Experiments terminology

Experiments heavily feature the use of statistical terminology. If you are unfamiliar with these terms, you may find it helpful to refer to the glossary below.

Term Definition Notes
Confidence interval

A range of values expressing the uncertainty around whether the result is statistically significant or not.

This is a key signal with which to interpret the experiment outcome. Statistical significance is determined when the confidence interval doesn't intersect with 0.

This may not show up for the first few days due to low data volumes.

Confidence level

The probability that the observed difference between variant and control is real rather than due to chance.

Example: A confidence level of 90% would mean that 10% of the time the observed difference is likely due to chance. In other words, there is a less than 10% chance that the data being tested could have occurred if there’s no difference between control and variant.
  • We use jackknife (a statistical tool that allows estimation of the variance through resampling) to calculate confidence intervals and then apply mixture sequential probability testing to control inflated false positive rate from continuous monitoring.
  • The default confidence level is 90%, but you can choose between 70% and 99%.
  • You can't change your confidence level once you start your experiment.
  • The most common confidence levels are 95%, 90%, and 99%; higher confidence levels require more data and, by extension, mean that experiment results are more likely to be reliable.
  • 50% would mean a 50:50 chance that the result is correct (similar to the throw of a coin).
Control The original testing variable (containing the originally priced in-app purchase) in an experiment arm. The control price is seen by a portion of the experiment audience defined during setup, and all users outside of the experiment.
Experiment outcome The result of your experiment. Possible experiment outcomes are listed in the following section.
False positive When a negative event is incorrectly categorized as positive.

When we detect that the spend difference between the control(s) and test is statistically significant (positive) but the spend difference between the control(s) and test is actually not statistically significant (negative), it's considered a false positive.

The false positive rate is the ratio of the number of false positives and the total number of actual negative events.

In-app products

Items for which users are charged on a one-time basis.

In-app products can include items like virtual goods (for example, game levels or potions) and premium services within your app on Google Play.

You can choose one or many different in-app products, with a maximum quantity of 2,000. In general, more data means less time to detect a difference between the control and variant(s), if there is a difference.

Experimenting with a price change across many products may decrease the potential of cannibalization.

Minimum detectable effect (MDE)

An input you choose when setting up the experiment that indicates the level of improvement you want your experiment to detect. You can think of this as deciding how sensitive you want your experiment to be, thus affecting the estimated time until statistical significance. Once you choose your MDE, you'll see an estimate of the time it'll take to determine a statistically significant result based on historical data of your selection.

For example, a smaller MDE means a more sensitive experiment, which leads to a longer estimated time to statistical significance.

  • The default MDE is 30%, but you can choose 5% increments starting from 5% up to 50%.
  • You can't change your MDE once you start your experiment.
  • The MDE only affects two things: the estimated time to statistical significance (calculated from historical data) and the point at which we determine that the experiment variant(s) and control performed the same. For more details, see possible experiment outcomes.
  • If you're testing new products with no historical data, the estimated time to statistical significance won't be accurate.

Novelty effect A phenomenon in which users may prefer or try out a new feature (such as a new product or price point), even if the feature is not better or more appealing than the original. In this case, the uplift usually phases out over time. Experiments that conclude quickly may be subject to novelty effects. We recommend using the estimated weeks until statistical significance calculations to consider this effect.
Statistical significance In an A/B test, this is the determination of whether or not the difference between the control and variant is real or due to chance.

If the confidence interval doesn't include the value of zero in the effect, we consider this a statistically significant result. In our case, the effect is the difference between control and variant spend.

You can view some visual examples of confidence intervals in the Frequently asked questions section.

Variant A variation of the original testing variable for an experiment arm. The variant is either a price increase or a price decrease on the originally priced in-app purchase.

Set up an experiment

When setting up an experiment, we recommend you include all interchangeable in-app products in your app (such as in-game currency) in the experiment to limit cannibalization.

Before you start: Low data volumes

If you're working with low data volumes, you likely won’t have enough data to draw a conclusion from the experiment. For example, if you're running an experiment with a new product, our estimated experiment length and warnings may not be accurate. In this case, you may see a warning in Play Console that a very low data volume is predicted within your chosen time frame. If you get a low data volume message, you may not have enough data to run an experiment or you may be able to fine-tune your experiment parameters. Here are some suggested actions to troubleshoot low data volume issues.

Conversely, if you're running a longer experiment, it’s important to consider that results may be impacted by economic factors like global currency fluctuations over time.

Part 1: Add details

To add your experiment details, do the following:

  1. Open Play Console and go to the Price experiments page (Monetize with Play > Price experiments).
  2. Click Create experiment.
  3. In the "Add details" section, enter an experiment name and a short description of your experiment (optional).
    • Note: Your experiment name and short description aren't seen by users.
  4. Select the countries/regions you want to experiment with.
    • Important: Price experiments are only implemented in the selected country/region and not in its associated locations. For example, if you select France, your price experiments will only be implemented in France and not in its associated territories like French Guiana.
  5. Using the Products drop-down, choose the in-app products you want to experiment with. The drop-down lists the available in-app product names, IDs, and price.
    • Important: If you choose an in-app product that's linked to a pricing template, the pricing template will be locked for the duration of the price experiment.
  6. Select a start date. Your experiment will begin at 00:00 PT on the date you select. You can schedule your experiment to launch in the future.
  7. Click Next to continue setting up your experiment by adding variants.

Based on your experiment setup input, if we predict your data volumes will be too low to reach statistical significance, you'll see a warning. In this case, we recommend making adjustments to ensure you gather enough data to reach statistical significance. To view possible next steps, see Recommended actions for low data volumes.

Recommended actions for low data volumes

The following recommendations will help to change the data volumes required to reach statistical significance so that the experiment duration is shortened.

Experiment setup step Recommended action
Part 1: Add details Increase the number of countries/regions in the experiment.
Part 1: Add details Test across all in-app products or increase the number of in-app products.
Part 2: Add variants Decrease the number of variants from two to one.
Part 2: Add variants Experiment with decreasing your price by a larger percentage (lower price). Note: This will not affect the initial estimation but will help during the live experiment.
Part 3: Manage settings Increase audience size per variant (for example, 50% control and 50% treatment).
Part 3: Manage settings Increase the minimum detectable effect. Note: This only affects experiments in which the variant(s) and the control performed the same.
Part 3: Manage settings Decrease the confidence level.

Part 2: Add variants

In the "Add variants" section, you can add and remove price variants. Your price experiment must have at least one variant. To add variants:

  1. Using the drop-down, choose Increase price or Decrease price, and enter the percentage you want to apply to your choice. Note the following:
    • You must enter a whole number.
    • If you select Decrease price, you must enter a value between 1 and 99.
    • If you select Increase price, you must enter a value between 1 and 999.
  2. After adding a variant, your control price range and variant price range are listed on the page. You can click View product prices for more detail, including product names and IDs, location overrides (if applicable), and tax information. Here you can see the actual percentage price changes after using the exchange rate and country-specific pricing patterns.
  3. Click + Add another variant and repeat the previous steps if you want to include multiple variants in your experiment. The control price range and variant price range are also listed on the page.
  4. Click Next to continue setting up your experiment by fine-tuning your experiment settings.
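
The input rules in step 1 can be sketched as a small validation helper. This is only an illustration of the stated limits (whole numbers, 1 to 99 for a decrease, 1 to 999 for an increase). The function name, the `"increase"`/`"decrease"` strings, and the signed return value are assumptions made for this sketch, not part of Play Console.

```python
def validate_variant(direction, percent):
    """Check a variant's percentage against the limits listed in step 1.

    Returns the change as a signed whole number (negative for a decrease).
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(percent, bool) or not isinstance(percent, int):
        raise ValueError("The percentage must be a whole number.")
    limits = {"decrease": (1, 99), "increase": (1, 999)}
    if direction not in limits:
        raise ValueError("Direction must be 'increase' or 'decrease'.")
    low, high = limits[direction]
    if not low <= percent <= high:
        raise ValueError(f"A price {direction} must be between {low} and {high}.")
    return percent if direction == "increase" else -percent
```

A call such as `validate_variant("decrease", 100)` raises an error, because a 100% decrease would make the product free.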

Part 3: Manage settings

In the "Manage settings" section, you can set your targeting parameters to fine-tune your experiment. To manage your experiment settings:

  1. Enter your experiment audience. This is the percentage of users who'll be in the experiment. These users will be split equally across your variant(s) and control. Note that users outside of the experiment will also see the original (control) price but won't be included in the experiment analysis.
    • Note: You must enter a whole number between 1 and 100.
  2. Enter your confidence level. Decreasing the confidence level will increase the likelihood of a false positive, but it also will decrease the experiment's runtime.
  3. Enter a minimum detectable effect. This indicates the level of improvement you want your experiment to detect. You can think of this as deciding how sensitive you want your experiment to be. This will adjust your estimated time until statistical significance calculation.
  4. Refine your targeting parameters to fine-tune your experiment. You can see the estimated time until statistical significance calculation as a guideline based on your setup.
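
To make the audience setting in step 1 concrete, the sketch below works out what share of all users lands in each arm when the experiment audience is split equally across the control and the variant(s). The function and arm names are hypothetical, and the arithmetic is only an illustration of the equal split described above.

```python
from fractions import Fraction


def arm_shares(audience_percent, num_variants):
    """Return the percentage of all users in each arm and outside the experiment."""
    if isinstance(audience_percent, bool) or not isinstance(audience_percent, int):
        raise ValueError("The audience must be a whole number.")
    if not 1 <= audience_percent <= 100:
        raise ValueError("The audience must be between 1 and 100.")
    if num_variants not in (1, 2):
        raise ValueError("An experiment has one or two variants.")
    arms = ["control"] + [f"variant_{chr(ord('A') + i)}" for i in range(num_variants)]
    # The audience is split equally across the control and the variant(s).
    per_arm = Fraction(audience_percent, len(arms))
    shares = {arm: per_arm for arm in arms}
    # Users outside the experiment see the control price but are not analyzed.
    shares["outside_experiment"] = Fraction(100 - audience_percent)
    return shares
```

With an audience of 60% and two variants, each arm receives 20% of all users and the remaining 40% see the original price outside the experiment. With an audience of 100% and one variant, the split is 50% control and 50% variant.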

Part 4: Launch your experiment

You are now ready to launch your experiment. During the experiment, prices will change for the selected in-app products and countries/regions. You can end your experiment at any time.

  1. Before launching your experiment, confirm that you understand the following:
    • The experiment may affect your app’s revenue. Note that without a statistically significant result, any negative or positive revenue result may not be meaningful, so we recommend waiting until you reach statistical significance.
    • You won’t be able to edit prices for the selected in-app products while the experiment is live.
    • Applying the results of an experiment will affect all included countries/regions and in-app products.
  2. Click Confirm and launch.

Optional: End your experiment

You can end your experiment at any time.

  1. Open Play Console and go to the Price experiments page (Monetize with Play > Price experiments).
  2. Open the experiment you want to end.
  3. Click End experiment.

After you end your experiment, experiment prices will revert to the original price for the relevant countries/regions and in-app products. The experiment and analysis will no longer be active.

If you have scheduled a future experiment, you can also cancel that by following the previous steps. Experiments that are canceled within 24 hours of their scheduled start date occasionally go live temporarily.

View experiment results and apply variants

It's important that you're familiar with different possible experiment outcomes and what they may mean for your app and its pricing strategy. Read this section carefully before applying your results.

View and analyze your experiment results

When your results are considered to be statistically significant, they'll be made available on your experiment's analysis page and you'll be notified via an Inbox message.

View your result

Visit your experiment to view the result and analysis. Go to the Price experiments page (Monetize with Play > Price experiments) and click the right arrow next to your experiment to view the experiment analysis page.

The outcome is displayed near the top of the page. Expand the Possible experiment outcomes section to view the different outcomes and what they mean. Underneath the experiment outcome, you can view a short description explaining the result. For example, you might see "Variant [X] performed the best" above a short explanation such as "Variant [X] achieved the greatest revenue." In cases where a variant has matched or outperformed a control, you can click Apply variant [X] to apply the variant and update your in-app product pricing accordingly. To learn more, see Apply variants.

Note: Experiments end automatically 14 days after a result is determined, or you can click End experiment to end it immediately.

Tip: You can apply experiment prices at any time, but we recommend waiting until your results are statistically significant and only applying price changes that perform better than your control group.

View supporting data

Below the outcome, a table shows the data that the statistical result is based on. The table lists the variant(s), revenue, revenue versus control (and revenue versus control as a percentage), and the confidence interval for your variant(s). If you want to understand the different metrics on your experiment's analysis page, expand the Metric definitions section.

Below the table in the "Supporting data" section, you can view more granular data. The chart displays revenue versus control information to the latest available date as default. In this view, the shaded area represents the confidence interval; the chart allows you to view how the confidence interval changed through time. You can adjust the selected duration and dates using the duration and date filters at the top right of the chart.

You can also display the following metrics using the metric filter at the top left of the chart: Revenue, Orders, Buyers, Buyer ratio, ARPPU (these metrics are also described in Metric definitions). If you choose another metric, the chart won't display a confidence interval, as this is based on revenue.

To view more product-level details, you can also export your experiment results as a CSV file. You can view the fields, format, and examples for the CSV export by expanding the Metric definitions (CSV export) section.

Tip: You can click View experiment setup to view the experiment setup details, which you can also filter by variant and by country/region. This can be helpful to quickly remind yourself of the experiment parameters.

Apply variants

Before applying your variant, note the following:

  • If one or more of your in-app products are linked to a pricing template, applying the variant will update your experiment's country/region prices and unlink these in-app products from the template(s).
  • You can only apply the variant to all in-app products and countries/regions in your experiment. We calculate statistical significance across the whole setup and therefore cannot guarantee the same outcome and effect if only part of the setup countries/regions and in-app products were applied.

If your variant(s) matched or outperformed your control, you can update your pricing by applying the variant:

  1. Open Play Console and go to the Price experiments page (Monetize with Play > Price experiments).
  2. Click Apply variant next to the variant you want to apply.

Your app's pricing will be updated as configured in the variant. Your experiment will end automatically 14 days after the result was determined, or you can click End experiment to end it immediately.

Possible experiment outcomes

There are multiple possible outcomes for an experiment. Your experiment outcome determines what your next steps should be.

Experiment outcome Meaning Notes/recommendations
Variant [X] performed best Variant [X] has become statistically significant by producing a meaningful result with the largest increase in revenue. Apply variant [X], as it significantly outperformed your control and the other variant(s).
Both variants performed better than the control Both variants achieved greater revenue than the control. Review the results and decide which variant to apply.
Variant(s) and the control performed the same No variant achieved greater revenue than the control. Your experiment has collected enough data and yields a draw between variant(s) and the control.
The control performed best The control achieved greater revenue than both variants. This suggests that your current price points for the experiment countries and product are already optimized. Continue to use the control price for the experiment products and countries.
More data needed Your experiment is in progress. More data is needed to determine a statistically significant result. Go to recommended actions for low data volumes to determine next steps.
Inconclusive result Your experiment has been stopped prematurely or has reached the maximum experiment duration of six months. More data is needed to determine a statistically significant outcome. Try running a new experiment with a different setup. Go to recommended actions for low data volumes to determine next steps. This may be an indication that buyers are indifferent about the in-app product prices in the experiment countries/regions.
Metric definitions

This table lists the metrics from your experiment's analysis page.

Metric Definition
All products revenue The gross revenue from experiment users who purchased in-app products in and outside the experiment over the specified time period.
Average revenue per paying user (ARPPU) The total gross revenue of in-app products divided by the number of unique buyers (users who make at least one purchase of in-app products from in and outside of the experiment) in the specified time period for the experiment. This metric will help you understand the value of buyers to your business.
Buyer ratio (28-day)

The percentage of monthly active users who made at least one in-app product purchase during the experiment, including in-app products from outside of the experiment.

Note: This is a key metric for understanding buyer conversion and increasing the breadth of your paying user base.

Buyers Unique users who purchase at least one in-app product at the experimented price over the specified time period.
New installer revenue Gross revenue from users who installed the app for the first time on any device after the experiment start date and saw the experimented price for the first time. Note that some users opt out of sharing this data with Google.
Orders The number of experimented in-app product purchases made during the experiment over the specified time period.
Revenue Gross revenue from experiment users who purchased in-app products in the experiment over the specified time period.
Metric definitions (CSV export)

This table lists the fields, format, and examples for metric definitions exported as a CSV file.

Field

Format Examples and notes
Date String

Mar 23, 2023

Date of the orders based on the Pacific Time Zone (in MMM DD, YYYY format).

SKU ID String

treasure_chest_for_new_users

Developer-specified unique ID of the in-app product.

Product Title String

coins, monthly subscription, and so on.

Developer-specified name of the in-app product.

Country String

BR, US, FR, and so on.

Unique country code for relevant in-app product metrics.

Experiment Arm String

CONTROL, A, B

Experiment arm for relevant in-app product metrics.

Developer Currency

String

USD, EUR, THB, and so on.

Currency for which the orders were converted. This is the local currency you're paid in.

Revenue Numeric

2794.60

Total gross revenue for the specified in-app product (given the date, country and experiment arm).

New Installer Revenue Numeric

577.20

Gross revenue from users who installed the app for the first time on any device after the experiment start date and saw the price for the first time (given the specified in-app product, given the date, country and experiment arm). Note that some users opt out of sharing this data with Google.

Orders Numeric

240

Total orders for the specified in-app product (given the date, country, and experiment arm).

Buyers Numeric

197

Total buyers for the specified product (given the date, country and experiment arm). Buyers are users who purchased the specific in-app product during the experiment.
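
As a rough illustration of working with the export, the sketch below totals revenue and orders per experiment arm and computes the raw revenue difference against the control. It assumes the CSV header row uses the field names from the table above exactly as written, which may not match the real export. The sample rows are made up for this example, and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Made-up rows for illustration; header names are assumed from the field table.
SAMPLE = '''Date,SKU ID,Product Title,Country,Experiment Arm,Developer Currency,Revenue,New Installer Revenue,Orders,Buyers
"Mar 23, 2023",treasure_chest_for_new_users,coins,BR,CONTROL,USD,100.00,20.00,10,9
"Mar 23, 2023",treasure_chest_for_new_users,coins,BR,A,USD,120.50,25.00,14,12
"Mar 24, 2023",treasure_chest_for_new_users,coins,BR,A,USD,30.00,5.00,3,3
'''


def revenue_by_arm(csv_text):
    """Sum Revenue and Orders for each value of the Experiment Arm field."""
    totals = defaultdict(lambda: {"revenue": Decimal("0"), "orders": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        arm = row["Experiment Arm"]
        totals[arm]["revenue"] += Decimal(row["Revenue"])
        totals[arm]["orders"] += int(row["Orders"])
    return dict(totals)


def revenue_versus_control(totals):
    """Raw revenue difference between each variant arm and the CONTROL arm."""
    control = totals["CONTROL"]["revenue"]
    return {
        arm: values["revenue"] - control
        for arm, values in totals.items()
        if arm != "CONTROL"
    }
```

The sketch deliberately does not sum the Buyers field: the same user can appear in several date or product rows, so adding the values would overcount unique buyers. It also assumes all rows share one developer currency.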

Frequently asked questions


What countries/regions can I run experiments in?

Countries/regions are eligible for price experiments only if they meet the following criteria:

  • Countries/regions you've already rolled out a release in (including on internal test tracks).
  • Countries/regions that support local currency.
Why can't I see any data on my experiment's analysis page?

If data volumes are very low, no data is available. Additionally, some data displayed on your experiment's analysis page may have a delay of up to seven days.

Can I use in-app products that are linked to pricing templates?

Yes. Note that if you run an experiment featuring an in-app product linked to a pricing template, the experimented country/region prices are locked for the duration of the experiment. However, you can still change prices for other countries/regions.

If you choose to apply the new prices at the end of the experiment, all experimented in-app products are unlinked from the associated pricing templates and the pricing template will remain unchanged. If you want to apply the experiment results without unlinking your in-app products from the pricing templates, go straight to the Pricing templates page (Setup > Pricing templates) and update the prices there.

To learn more, see pricing templates.

Can I test prices that are much lower or higher than the current set limits per country?

No; experiment prices can only be tested within the minimum and maximum price ranges defined per country/region. If a price change you enter during experiment setup would fall outside either limit, the price automatically stops at the minimum or maximum price. To review our list of price ranges and currencies allowed by country, see Supported locations for distribution to Google Play users.

Do I need to be on Google Play Billing Library 5 to run price experiments?

No; if you're integrated with any version of Google Play's billing system, you can run price experiments.

How can I see if a user made a purchase in the variant instead of the control?

All experiment orders are visible in your downloadable monthly financial reports (estimated sales report and earnings report). You can see which experimented in-app products are bought at the control and variant prices for the countries/regions you're running your price experiment in.

Can I use other A/B testing and price experiments at the same time?

We recommend that you don't run any concurrent A/B experiments to ensure meaningful results. For example, Firebase works independently from price experiments in Play Console and any interference with specific in-app products may affect the reliability of experiment results.

What method do you use to determine statistical significance?

We use jackknife (a statistical tool that allows an estimation of the variance through resampling) to calculate confidence intervals and then apply mixture sequential probability testing to control an inflated false positive rate from continuous monitoring. Statistical significance is determined when the confidence interval doesn't intersect with zero.
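
For readers unfamiliar with the jackknife, here is a minimal sketch of a delete-one jackknife variance estimate and a confidence interval for the difference in mean spend between a variant and the control. It is not Google's implementation: it uses a fixed-sample normal approximation and does not include the mixture sequential probability testing step that corrects for continuous monitoring. The function names and the per-user spend inputs are assumptions for illustration.

```python
import math
from statistics import NormalDist, mean


def jackknife_variance(values):
    """Delete-one jackknife estimate of the variance of the sample mean."""
    n = len(values)
    if n < 2:
        raise ValueError("At least two observations are needed.")
    total = sum(values)
    # Recompute the mean n times, leaving out one observation each time.
    leave_one_out = [(total - v) / (n - 1) for v in values]
    center = sum(leave_one_out) / n
    return (n - 1) / n * sum((m - center) ** 2 for m in leave_one_out)


def difference_interval(control, variant, confidence=0.90):
    """Normal-approximation interval for mean(variant) - mean(control)."""
    effect = mean(variant) - mean(control)
    # The two arms are treated as independent, so their variances add.
    std_error = math.sqrt(jackknife_variance(control) + jackknife_variance(variant))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return effect - z * std_error, effect + z * std_error
```

Under this sketch, raising the confidence level widens the interval, which is consistent with higher confidence levels needing more data before the interval excludes zero.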

What permissions do I need for using the price experiments tool?

You must have the Manage store presence and View financial data permissions to use price experiments.

Why has my experiment stopped?

The experiment stops automatically 14 days after it reaches statistical significance or at the maximum experiment length of six months. We don't stop the experiment at the estimated duration given at the experiment setup; instead, we stop it after a statistically significant difference has been detected, with 14 days given to apply the new pricing before prices revert back to the original.

Is it possible that a user (on the same account and using the same device) could see different daily prices after the experiment has started?

No, this isn't possible. A user will only see one price for the duration of the experiment.

Can I add more countries/regions to a live experiment?

No, this isn't possible because the experiment has already started collecting results with the parameters you initially selected. You can either run a second experiment in parallel with the additional countries/regions selected, or you can end the current experiment and set up a new experiment. Note that if there are overlapping in-app products and countries/regions, you'll need to wait 30 days before you can set up a new experiment.

Why am I unable to replicate the same result as my price experiments?

Statistical significance is calculated across all countries/regions combined with experimented products as opposed to independently for each combination. If you’re interested in the price adjustment in a particular country/region, we recommend only selecting that country/region in the experiment setup.

Can I still apply a variant after the experiment stops or is ended?

Yes, you can apply a variant even after prices have reverted back to the original.

Why do some experiments produce an inconclusive result?

If your experiment's in-app products don’t generate revenue that’s significantly different to your control or there isn’t enough data, the result is considered to be inconclusive. For this outcome, it's possible that buyers are indifferent to the two prices.

Is it generally good practice to run an experiment featuring one in-app product at a time or multiple in-app products?

In general, a larger sample size gives greater power. If the single in-app product you plan to test is one of the most popular products with many buyers, then testing that in-app product alone may be worthwhile. Otherwise, we encourage testing multiple in-app products together to increase the power of your experiment.

If you have multiple in-app products that are interchangeable to some degree (for example, USD 0.99 for 60 in-app currency units and USD 4.99 for 300 in-app currency units), then we encourage you to test both in-app products together to avoid cannibalization.

Should I run experiments with one or two variants?

We generally recommend one-variant experiments with your originally-priced in-app product(s) as the control and your discounted in-app product(s) as the variant. A two-variant experiment has your originally-priced in-app product(s) as the control and the same in-app product(s) at two different price points. Results from such experiments can be difficult to interpret, and two-variant experiments take more time than single-variant experiments to reach statistical significance.

How should I interpret my experiment's confidence interval?

If the confidence interval doesn't include the value of zero in the effect, we consider this a statistically significant result. In our case, the effect is the difference between control and variant spend.

Here's an example of how your confidence interval might look with a conclusive positive result:

[Image: confidence interval for a conclusive positive result]

Here's an example of how your confidence interval might look with a conclusive negative result:

[Image: confidence interval for a conclusive negative result]

Here's an example of how your confidence interval might look with an inconclusive result that needs more data:

[Image: confidence interval for an inconclusive result]
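
The three cases above follow from one rule: a result is statistically significant only when the confidence interval excludes zero. A minimal sketch of that rule, using a hypothetical function name and labels chosen for this example:

```python
def classify_interval(lower, upper):
    """Classify a confidence interval for the variant-versus-control effect."""
    if lower > upper:
        raise ValueError("The lower bound must not exceed the upper bound.")
    if lower > 0:
        return "conclusive positive"  # entire interval is above zero
    if upper < 0:
        return "conclusive negative"  # entire interval is below zero
    return "not conclusive"  # interval includes zero; more data is needed
```

An interval that merely touches zero is treated as including it, so it is classified as not conclusive.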

Does Google provide developers any protection against pricing arbitrage?

Yes, we continuously monitor user location spoofing and take measures to mitigate the impact on our developers and their products.
