Apache Spark
Spark interview questions: How to solve OOM (out of memory) errors in Apache Spark
- Increase executor memory: raise the executor memory in your Spark job configuration, for example --executor-memory 4G
- If the OOM happens on the driver, raise the driver memory instead: --driver-memory 4G
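- Tune memory fractions: Spark splits each executor's unified memory region between storage (cached data) and execution (shuffles, joins, sorts, aggregations). The defaults are spark.memory.fraction=0.6 and spark.memory.storageFraction=0.5; adjust them to fit the workload.
- Give computation more memory if tasks are spilling: raise spark.memory.fraction, for example to 0.8. This leaves less of the heap for user data structures.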
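- Avoid wide transformations where possible and tune the shuffle partition count (spark.sql.shuffle.partitions, 200 by default). Prefer efficient operations such as reduceByKey over groupByKey, because reduceByKey combines values within each partition before the shuffle.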
- If memory is limited, let data spill to disk, for example by persisting with the MEMORY_AND_DISK storage level instead of MEMORY_ONLY.
- Optimize partitions: if there are far too many partitions, reduce them with coalesce; if partitions are too few or too large, increase them with repartition.
- Broadcast the smaller table in a join so that the larger table does not have to be shuffled.
- Optimize garbage collection: spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35
- Monitor and debug: enable event logs with spark.eventLog.enabled=true
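To make the memory-fraction settings in the list concrete, here is a small arithmetic sketch in plain Python (no Spark needed). It assumes the documented model in which Spark reserves 300 MiB of the executor heap and applies spark.memory.fraction to the remainder. The helper name `unified_memory_regions` is made up for this illustration, and real executors also have off-heap and overhead memory that this ignores.

```python
RESERVED_MIB = 300  # fixed amount Spark sets aside from the heap


def unified_memory_regions(heap_mib, memory_fraction=0.6, storage_fraction=0.5):
    """Return (unified, storage, execution) sizes in MiB for one executor.

    Hypothetical helper for illustration only. The storage/execution
    boundary is soft in Spark: storage_fraction marks the part of the
    unified region where cached blocks are protected from eviction.
    """
    usable = heap_mib - RESERVED_MIB
    unified = usable * memory_fraction
    storage = unified * storage_fraction
    execution = unified - storage
    return unified, storage, execution


# Compare the default fraction with the larger value suggested for spilling tasks.
for fraction in (0.6, 0.8):
    unified, storage, execution = unified_memory_regions(4096, memory_fraction=fraction)
    print(f"fraction={fraction}: unified={unified:.1f} MiB, "
          f"storage={storage:.1f} MiB, execution={execution:.1f} MiB")
```

With a 4G heap, the default fraction leaves roughly 2.2 GiB for storage and execution combined, which is why simply raising --executor-memory is usually the first thing to try.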
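The settings from the list can be collected into one spark-submit invocation. The sketch below only builds and prints the command line; it does not run Spark. `build_spark_submit` and the application file `my_job.py` are hypothetical names used for illustration, and the values are the ones from the list, not recommendations for every job. Note that the event-log property is spelled spark.eventLog.enabled.

```python
import shlex


def build_spark_submit(app, executor_memory="4G", driver_memory="4G", conf=None):
    """Build a spark-submit argument list (hypothetical helper, illustration only)."""
    args = [
        "spark-submit",
        "--executor-memory", executor_memory,
        "--driver-memory", driver_memory,
    ]
    # Each Spark property is passed as its own --conf key=value pair.
    for key, value in (conf or {}).items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


conf = {
    "spark.memory.fraction": "0.6",
    "spark.memory.storageFraction": "0.5",
    "spark.sql.shuffle.partitions": "200",
    "spark.executor.extraJavaOptions": "-XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35",
    "spark.eventLog.enabled": "true",
}

command = build_spark_submit("my_job.py", conf=conf)
# shlex.join quotes arguments that contain spaces, such as the JVM options.
print(shlex.join(command))
```

Building the command as a list and quoting it at the end keeps the multi-flag extraJavaOptions value together as a single --conf argument.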
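Why reduceByKey is lighter on memory than groupByKey can be shown with a plain-Python simulation. This is not Spark code: the two functions below are hypothetical stand-ins that only count how many records would cross the shuffle in each style, assuming the goal is a per-key sum.

```python
from collections import defaultdict


def shuffle_group_by_key(partitions):
    """groupByKey style: every (key, value) record crosses the shuffle."""
    shuffled = [record for part in partitions for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)  # all values for a key are held in memory
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def shuffle_reduce_by_key(partitions):
    """reduceByKey style: combine inside each partition before the shuffle."""
    shuffled = []
    for part in partitions:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value  # map-side combine
        shuffled.extend(local.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("a", 1), ("c", 1)],
]

grouped_totals, grouped_records = shuffle_group_by_key(partitions)
reduced_totals, reduced_records = shuffle_reduce_by_key(partitions)
print("same result:", grouped_totals == reduced_totals)
print("records shuffled:", grouped_records, "vs", reduced_records)
```

Both styles give the same totals, but the combine-first version moves fewer records and never materializes the full list of values for a key, which is where groupByKey tends to run out of memory on skewed data.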