Tune Performance for Iceberg Topics

This guide covers strategies for optimizing the performance of Iceberg topics in Redpanda, including improving downstream query performance, tuning the Iceberg translation pipeline, and monitoring translation throughput.

After reading this page, you will be able to:

Apply partitioning and compaction strategies to improve query performance
Choose appropriate lag target values for your workload
Identify translation performance signals using Iceberg metrics

Prerequisites

You must be familiar with how Iceberg topics work in Redpanda. See About Iceberg Topics.

Optimize query performance

Query engines read Parquet files from object storage to process Iceberg table data. Partitioning, compaction, and schema design affect how efficiently those reads perform.

Use custom partitioning

To improve query performance, consider implementing custom partitioning for the Iceberg topic. Use the redpanda.iceberg.partition.spec topic property to define the partitioning scheme:

# Create new topic with five topic partitions, replication factor 3, and custom table partitioning for Iceberg
rpk topic create <new-topic-name> -p5 -r3 -c redpanda.iceberg.mode=value_schema_id_prefix -c "redpanda.iceberg.partition.spec=(<partition-key1>, <partition-key2>, ...)"

Valid <partition-key> values include a source column name or a transformation of a column. The columns referenced can be Redpanda-defined (such as redpanda.timestamp) or user-defined based on a schema that you register for the topic. Based on this specification, the Iceberg table stores records with different partition key values in separate files.

For example:

To partition the table by a single key, such as a column col1, use: redpanda.iceberg.partition.spec=(col1).
To partition by multiple columns, use a comma-separated list: redpanda.iceberg.partition.spec=(col1, col2).
To partition by the year of a timestamp column ts1, and a string column col1, use: redpanda.iceberg.partition.spec=(year(ts1), col1).

To learn more about how partitioning schemes can affect query performance, and for details on the partitioning specification such as allowed transforms, see the Apache Iceberg documentation.

Partition by columns that you frequently use in queries. Columns with relatively few unique values (low cardinality) are good candidates for partitioning.
If you must partition based on high-cardinality columns, such as timestamps, use Iceberg's available transforms (for example, extracting the year, month, or day) to avoid creating too many partitions. Too many partitions can hurt performance because more files must be scanned and managed.
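The effect of a transform on partition count can be illustrated with a small sketch. The lambdas below are simplified stand-ins for the identity, day, month, and year transforms, not Iceberg's implementation, and the sample data (one record per minute for 90 days) is hypothetical.

```python
from datetime import datetime, timedelta


def partition_count(timestamps, transform):
    """Count the distinct partition values a transform would produce."""
    return len({transform(ts) for ts in timestamps})


# Hypothetical sample: one record per minute for 90 days, starting 2024-01-01.
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(minutes=i) for i in range(90 * 24 * 60)]

# Simplified stand-ins for partition transforms (illustration only).
transforms = {
    "identity": lambda ts: ts,
    "day": lambda ts: (ts.year, ts.month, ts.day),
    "month": lambda ts: (ts.year, ts.month),
    "year": lambda ts: ts.year,
}

counts = {name: partition_count(timestamps, fn) for name, fn in transforms.items()}
for name, count in counts.items():
    print(f"{name:>8}: {count} partitions")
```

In this sample, partitioning on the raw timestamp yields one partition per record, while the day transform yields 90 partitions and the month transform yields 3.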

Compact Iceberg tables

Over time, Iceberg translation can produce many small Parquet files, especially with low-throughput topics or short lag targets. Compaction merges small files into larger ones, reducing the number of metadata operations query engines must perform and improving read performance.

Automatic compaction: Some catalog and data platform services, such as AWS Glue and Databricks, automatically compact Iceberg tables.
Manual or scheduled compaction: Tools like Apache Spark can run compaction jobs on a schedule. This is useful if your catalog or platform does not compact automatically.

If you observe degraded read performance or a high number of small files, check whether your catalog or platform supports automatic compaction. If it does not, schedule periodic compaction jobs.

Avoid high column count

A high column count or schema field count results in more overhead when translating topics to the Iceberg table format. Small message sizes can also increase CPU utilization. To minimize the performance impact on your cluster, keep to a low column count and large message size for Iceberg topics.

Tune translation performance

Translation is the process in which Redpanda converts topic data into Parquet files for the Iceberg table. Each round of translation processes one topic partition at a time.

Under typical conditions, Iceberg translation has the following performance characteristics:

Throughput: Approximately 5 MiB/s per core.
Flush threshold: 32 MiB. Each translation process uploads its on-disk data when accumulated data reaches this threshold. This is the primary control for Parquet file size, and is managed by Redpanda Cloud.
Lag target: Controlled by iceberg_target_lag_ms (default: 1 minute). Redpanda tries to commit all data produced to an Iceberg-enabled topic within this window.

The flush threshold and lag target together determine the size of the Parquet files written to object storage. Larger Parquet files generally improve downstream query performance by reducing the number of metadata operations query engines must perform.
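As a rough illustration of how the two values interact, the following sketch estimates how much raw data one topic partition accumulates before a flush. It assumes a simplified model in which data is flushed when either the lag target elapses or the flush threshold is reached, whichever comes first. The function name and the sample ingest rates are hypothetical, and the result is raw input size. Parquet output is usually smaller after encoding and compression.

```python
MIB = 1024 * 1024
FLUSH_THRESHOLD_BYTES = 32 * MIB  # flush threshold stated on this page


def estimate_flush_bytes(partition_bytes_per_sec, lag_target_ms):
    """Estimate raw bytes accumulated per flush for one topic partition.

    Assumption (simplified model): a flush happens when the lag target
    elapses or the flush threshold is reached, whichever comes first.
    """
    accumulated = partition_bytes_per_sec * lag_target_ms / 1000
    return min(accumulated, FLUSH_THRESHOLD_BYTES)


# Hypothetical per-partition ingest rates with the default 1-minute lag target.
low = estimate_flush_bytes(100 * 1024, 60_000)  # 100 KiB/s
high = estimate_flush_bytes(2 * MIB, 60_000)    # 2 MiB/s
print(f"low-throughput: {low / MIB:.2f} MiB, high-throughput: {high / MIB:.2f} MiB")
```

Under this model, the low-throughput partition is limited by the lag target (under 6 MiB per flush), while the high-throughput partition is limited by the flush threshold.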

Tune the lag target

In Redpanda Cloud, datalake_translator_flush_bytes is managed for you and is not user-tunable. To adjust the size of Parquet files written to object storage, increase the lag target instead. A larger lag target gives translators more time to accumulate data before committing, resulting in larger Parquet files with more records per file.

You can configure the lag target at the cluster level or per-topic:

Cluster-wide: edit iceberg_target_lag_ms in the Redpanda Cloud Console. For instructions, see Configure Cluster Properties.
Per-topic: set the redpanda.iceberg.target.lag.ms topic property. The topic property overrides the cluster default for that topic.

Increasing the lag target means Iceberg tables receive new data less frequently. Choose a lag value that balances file efficiency against how current your downstream data must be.
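One way to reason about this trade-off is to work backward from a desired amount of input per file. The sketch below is a back-of-the-envelope calculation, not a Redpanda API: the function name, the workload numbers, and the freshness limit are all hypothetical, and it assumes a steady per-partition ingest rate.

```python
MIB = 1024 * 1024


def lag_target_ms_for(target_file_bytes, partition_bytes_per_sec, max_staleness_ms):
    """Lag target (ms) needed to accumulate target_file_bytes of raw input,
    capped at the staleness that downstream consumers can tolerate."""
    needed_ms = target_file_bytes * 1000 // partition_bytes_per_sec
    return min(needed_ms, max_staleness_ms)


# Hypothetical workload: 100 KiB/s per partition, aiming for about 16 MiB of
# raw input per file, with at most 10 minutes of acceptable staleness.
lag_ms = lag_target_ms_for(16 * MIB, 100 * 1024, 10 * 60 * 1000)
print(f"candidate lag target: {lag_ms} ms")

# A slower partition hits the staleness cap before reaching the target size.
capped_ms = lag_target_ms_for(32 * MIB, 10 * 1024, 10 * 60 * 1000)
print(f"capped lag target: {capped_ms} ms")
```

The second case shows that for very low-throughput partitions, the freshness requirement rather than the file size goal decides the lag target, and compaction becomes the main tool for reducing small files.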

To check the current cluster-wide value:

rpk cluster config get iceberg_target_lag_ms

To check topic-level overrides:

rpk topic describe <topic-name> -c

Optimize message size

Redpanda has validated 32 MiB as the maximum recommended message size for Iceberg-enabled topics. With large messages, each Parquet file contains fewer records because the flush threshold is reached sooner. This can reduce the efficiency of analytical queries that need to scan many records.

If query latency is a concern and your workload produces large messages, consider:

Reducing individual message sizes if your data model allows it.
Increasing iceberg_target_lag_ms to produce Parquet files with more records per file. See Tune the lag target.
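The relationship between message size and records per file can be approximated with simple arithmetic. This sketch assumes uniformly sized messages and ignores per-record overhead. The function name and sample sizes are hypothetical.

```python
MIB = 1024 * 1024
FLUSH_THRESHOLD_BYTES = 32 * MIB  # flush threshold stated on this page


def max_records_per_flush(message_bytes):
    """Approximate number of records accumulated before the flush threshold
    is reached, assuming every message has the same size."""
    return max(1, FLUSH_THRESHOLD_BYTES // message_bytes)


# Hypothetical message sizes: 1 KiB, 1 MiB, and the 32 MiB maximum.
for size in (1024, 1 * MIB, 32 * MIB):
    print(f"{size:>10} bytes/message -> {max_records_per_flush(size)} records")
```

At the 32 MiB maximum, a single message fills the threshold, so each flush carries roughly one record.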

Size clusters for Iceberg workloads

When you enable Iceberg for any substantial workload and start translating topic data to the Iceberg format, you may see CPU utilization increase across most of your cluster. If this additional workload overwhelms the brokers and causes the Iceberg table lag to exceed the configured target lag, Redpanda automatically increases the scheduling priority of Iceberg translation to help it catch up with incoming data. However, this does not substitute for adequate cluster resources.

You may need to increase the size of your Redpanda cluster to accommodate the additional workload. To ensure that your cluster is sized appropriately, contact the Redpanda Customer Success team.
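For a first rough estimate before that conversation, you can divide the expected Iceberg ingest rate by the approximate per-core translation throughput stated on this page. The sketch below gives a lower bound only. It assumes the 5 MiB/s per core figure holds for your workload and ignores CPU needed for producing, consuming, and other cluster activity. The function name and sample rates are hypothetical.

```python
import math

TRANSLATION_MIB_PER_SEC_PER_CORE = 5  # approximate figure stated on this page


def min_cores_for_translation(ingest_mib_per_sec):
    """Lower bound on cores needed for translation to keep pace with ingest.

    Assumption: translation throughput scales linearly with cores and no
    other workload competes for CPU.
    """
    return math.ceil(ingest_mib_per_sec / TRANSLATION_MIB_PER_SEC_PER_CORE)


# Hypothetical ingest rates for Iceberg-enabled topics, in MiB/s.
for rate in (12, 48, 50):
    print(f"{rate} MiB/s -> at least {min_cores_for_translation(rate)} cores")
```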

Monitor translation performance

Use the following Iceberg metrics to understand whether translation is keeping pace with incoming data:

redpanda_iceberg_translation_raw_bytes_processed: Total raw bytes consumed for translation input. Use this to monitor input throughput and compare against the expected 5 MiB/s per core baseline.
redpanda_iceberg_translation_parquet_bytes_added: Total bytes written to Parquet files. Divide by redpanda_iceberg_translation_files_created to estimate the average file size produced by your workload.
redpanda_iceberg_translation_files_created: Number of Parquet files created. A high file creation rate relative to bytes added indicates many small files. Consider increasing iceberg_target_lag_ms.
redpanda_iceberg_translation_parquet_rows_added: Total rows written to Parquet files. Useful for understanding record-level throughput.
redpanda_iceberg_translation_translations_finished: Number of completed translator executions. A stalling or zero rate indicates translation has stopped.
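Because these metrics are cumulative totals, rates and averages come from the difference between two samples. The following sketch shows the calculation described above for input throughput and average file size. The sample values, the 60-second interval, and the way the samples are collected are hypothetical. Only the metric names come from this page.

```python
MIB = 1024 * 1024

RAW = "redpanda_iceberg_translation_raw_bytes_processed"
PARQUET = "redpanda_iceberg_translation_parquet_bytes_added"
FILES = "redpanda_iceberg_translation_files_created"


def summarize(prev, curr, interval_sec):
    """Derive throughput and average file size from two counter samples."""
    raw = curr[RAW] - prev[RAW]
    parquet = curr[PARQUET] - prev[PARQUET]
    files = curr[FILES] - prev[FILES]
    return {
        "input_mib_per_sec": raw / interval_sec / MIB,
        # No files created in the interval means no average to report.
        "avg_file_mib": parquet / files / MIB if files else None,
    }


# Hypothetical samples taken 60 seconds apart.
prev = {RAW: 0, PARQUET: 0, FILES: 0}
curr = {RAW: 600 * MIB, PARQUET: 120 * MIB, FILES: 60}
summary = summarize(prev, curr, 60)
print(summary)
```

In this hypothetical sample, translation consumes 10 MiB/s of input and produces files averaging 2 MiB, which would suggest many small files and a candidate for a larger lag target.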

For metrics related to DLQ files, invalid records, and catalog commit failures, see Troubleshooting metrics.

If translation consistently lags despite available CPU headroom, the workload may be partition-bound. Each core translates its assigned partitions independently, so distributing data across more partitions allows more cores to contribute to translation and can improve total throughput.
