Problem Description

I created a custom payload for merging records in my Hudi table, which uses the record-level index, and then tried to upsert values into the table. The upsert succeeded for a single record as well as for batches of fewer than 50k records, but with 50k records or more it fails with an UnsupportedOperationException (stacktrace below).
How to reproduce the problem?

Step 1: Generate the data and Hudi table

Case class that encapsulates my dataset
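The actual case class was attached to the ticket; as a minimal stand-in (field names here are assumptions, not the original schema):

// Hypothetical stand-in for the dataset's case class
case class RandomData(
  recordKey: String, // primary key used for the record-level index
  partition: String, // partition path field
  value: Long,       // an arbitrary payload column
  ts: Long           // precombine / ordering field
)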
Spark code to generate a sample dataframe
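A minimal sketch, assuming a local SparkSession and the hypothetical RandomData class above:

import org.apache.spark.sql.SparkSession
import scala.util.Random

val spark = SparkSession.builder()
  .appName("hudi-rli-repro")
  .master("local[*]")
  // Hudi requires Kryo-based serialization in Spark
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .getOrCreate()

import spark.implicits._

// Generate enough rows that a later upsert can cross the ~50k threshold
val df = (1 to 100000).map { i =>
  RandomData(s"key-$i", s"part-${i % 10}", Random.nextLong(), System.currentTimeMillis())
}.toDF()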
Step 2: Create the Hudi table from the random data
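Again a sketch; the table path and names are assumptions, but the record-level index options match what the report describes:

import org.apache.spark.sql.SaveMode

val tablePath = "s3a://my-bucket/hudi/random_data" // hypothetical location

df.write.format("hudi")
  .option("hoodie.table.name", "random_data")
  .option("hoodie.datasource.write.recordkey.field", "recordKey")
  .option("hoodie.datasource.write.partitionpath.field", "partition")
  .option("hoodie.datasource.write.precombine.field", "ts")
  // enable the record-level index
  .option("hoodie.metadata.record.index.enable", "true")
  .option("hoodie.index.type", "RECORD_INDEX")
  .mode(SaveMode.Overwrite)
  .save(tablePath)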
Step 3: Create a custom Hudi payload class
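The real payload class is attached to the ticket; the following is a speculative reconstruction inferred only from the class and field names in the stacktrace (com.test.hudi.RandomDataPayload holding a record field):

package com.test.hudi

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericRecord, IndexedRecord}
import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload
import org.apache.hudi.common.util.{Option => HOption}

// Speculative sketch: the retained "record" field (visible in the serialization
// trace) forces Kryo to serialize org.apache.avro.Schema, whose internal
// collections are unmodifiable and fail on deserialization.
class RandomDataPayload(val record: GenericRecord, orderingVal: Comparable[_])
    extends OverwriteWithLatestAvroPayload(record, orderingVal) {

  override def combineAndGetUpdateValue(currentValue: IndexedRecord,
                                        schema: Schema): HOption[IndexedRecord] = {
    // Custom merge logic would live here; this sketch just defers to
    // latest-record-wins semantics.
    super.combineAndGetUpdateValue(currentValue, schema)
  }
}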
Step 4: Create a data parcel for the upsert
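Continuing the sketch, an update batch large enough to hit the failure threshold:

// 50k updated rows for existing keys; below ~50k the upsert succeeds
val updates = (1 to 50000).map { i =>
  RandomData(s"key-$i", s"part-${i % 10}", Random.nextLong(), System.currentTimeMillis())
}.toDF()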
Step 5: Perform the upsert operation
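The upsert itself, with the custom payload wired in (option keys are standard Hudi 0.15 datasource configs; the payload class name comes from the stacktrace):

updates.write.format("hudi")
  .option("hoodie.table.name", "random_data")
  .option("hoodie.datasource.write.operation", "upsert")
  .option("hoodie.datasource.write.recordkey.field", "recordKey")
  .option("hoodie.datasource.write.partitionpath.field", "partition")
  .option("hoodie.datasource.write.precombine.field", "ts")
  // route merges through the custom payload
  .option("hoodie.datasource.write.payload.class", "com.test.hudi.RandomDataPayload")
  .option("hoodie.index.type", "RECORD_INDEX")
  .mode(SaveMode.Append)
  .save(tablePath)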
Expected Behavior

I expect the upsert operation to complete without any exceptions.

Environment Description

Hudi version : 0.15.0
Spark version : 3.4.1
Hive version :
Hadoop version :
Storage (HDFS/S3/GCS..) : S3
Running on Docker? (yes/no) : No
Additional Context

This problem does not surface when the upsert batch has 20k or 30k rows, but it does once the batch size goes higher.
Stacktrace
Caused by: org.apache.hudi.exception.HoodieException: com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
record (com.test.hudi.RandomDataPayload)
at org.apache.hudi.common.util.queue.SimpleExecutor.execute(SimpleExecutor.java:75)
at org.apache.hudi.table.action.commit.HoodieMergeHelper.runMerge(HoodieMergeHelper.java:149)
... 33 more
Caused by: com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
record (com.test.hudi.RandomDataPayload)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:161)
at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:39)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
at org.apache.hudi.common.model.HoodieAvroRecord.readRecordPayload(HoodieAvroRecord.java:231)
at org.apache.hudi.common.model.HoodieAvroRecord.readRecordPayload(HoodieAvroRecord.java:48)
at org.apache.hudi.common.model.HoodieRecord.read(HoodieRecord.java:373)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$KryoSerializableSerializer.read(DefaultSerializers.java:520)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$KryoSerializableSerializer.read(DefaultSerializers.java:512)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:102)
at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:76)
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:209)
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:202)
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:198)
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:67)
at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:198)
at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:56)
at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:350)
at org.apache.hudi.table.action.commit.BaseMergeHelper$UpdateHandler.consume(BaseMergeHelper.java:54)
at org.apache.hudi.table.action.commit.BaseMergeHelper$UpdateHandler.consume(BaseMergeHelper.java:44)
at org.apache.hudi.common.util.queue.SimpleExecutor.execute(SimpleExecutor.java:69)
... 34 more
Caused by: java.lang.UnsupportedOperationException
at java.util.Collections$UnmodifiableCollection.add(Collections.java:1057)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
... 66 more
The Spark job completes successfully with the default payload. I suspect the issue lies with the custom payload. Could you please review your code to identify the problem?
@rangareddy - I'm running into the issue with the custom payload that I have linked in this ticket. Were you able to reproduce it on your end? Or are you saying that you generated the random data and then used the default payload to run the Spark job?
I'm seeking help because the custom payload is causing an exception while the default one is not.