fix: handle inf/-inf in ShimSparkErrorConverter cast overflow #3768

Open
manuzhang wants to merge 1 commit into apache:main from manuzhang:fix-shim-cast-overflow-inf

@manuzhang
Member

Which issue does this PR close?

Closes #3767.

Rationale for this change

Fixes incorrect exception translation for overflow cases involving infinity literals and aligns Comet behavior with Spark expectations in ANSI mode.

What changes are included in this PR?

Normalize inf literals for float/double cast overflow conversion across Spark 3.4/3.5/4.0 and add unit tests in SparkErrorConverterSuite.
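
As a rough illustration of the problem and the kind of normalization described above: Scala's `toDouble` delegates to `java.lang.Double.parseDouble`, which accepts `Infinity` but not the shorter `inf` spelling, so an unnormalized literal raises `NumberFormatException`. The sketch below is not the PR's code; the object and method names (`InfLiteralSketch`, `parseDoubleLiteralSketch`) are hypothetical and only mirror the shape of the change.

```scala
import scala.util.Try

object InfLiteralSketch {
  // Hypothetical helper (not taken from the PR): map inf-style spellings
  // to the Double infinities, and fall back to the standard parser for
  // every other literal.
  def parseDoubleLiteralSketch(value: String): Double = {
    val trimmed = value.trim
    trimmed.toLowerCase match {
      case "inf" | "+inf" | "infinity" | "+infinity" => Double.PositiveInfinity
      case "-inf" | "-infinity" => Double.NegativeInfinity
      case _ => trimmed.toDouble
    }
  }

  // Without normalization the standard parser rejects "inf".
  val rawParseFails: Boolean = Try("inf".toDouble).isFailure
}
```

Ordinary numeric literals still go through `toDouble` unchanged, so only the special spellings take the new path.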

How are these changes tested?

Added a new unit test suite, SparkErrorConverterSuite.

@manuzhang
Member Author

@parthchandra Please take a look when you find time.

@manuzhang manuzhang force-pushed the fix-shim-cast-overflow-inf branch from ad224ca to f38fd91 Compare March 23, 2026 12:09

private def parseFloatLiteral(value: String): Float = {
value.toLowerCase match {
case "inf" | "+inf" | "infinity" | "+infinity" => Float.PositiveInfinity
Member

I know this issue is focused on inf but do we need to do anything with nan as well?

Member Author

Covered nan as well.

Normalize inf/nan literals for float/double cast overflow conversion across Spark 3.4/3.5/4.0 and add unit tests in SparkErrorConverterSuite for float/double inf/-inf/nan.

Co-authored-by: Codex <[email protected]>
@manuzhang manuzhang force-pushed the fix-shim-cast-overflow-inf branch from f38fd91 to 5068336 Compare March 24, 2026 02:55
}

private def parseDoubleLiteral(value: String): Double = {
value.toLowerCase match {
Contributor

In conversion_funcs/numeric.rs:spark_cast_nonintegral_numeric_to_integral the calls to cast_float_to_int16_down and cast_float_to_int32_up explicitly format the string with "{:e}D" (a suffix D).
I think inf and nan will get this D suffix and the resultant string infD or nanD would not match.
The unit tests below will not catch this either.
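
One possible way to address this concern, sketched under the assumption that the native side really does emit strings such as `infD` or `NaND`, is to strip a trailing uppercase `D` before matching the special literals. Only the uppercase `D` is removed, because `inf` itself ends in a lowercase `f` and must stay intact. The names below (`SuffixedLiteralSketch`, `parseDoubleLiteralSketch`) are hypothetical and are not taken from the PR.

```scala
object SuffixedLiteralSketch {
  // Hypothetical helper: tolerate an optional trailing 'D' type suffix,
  // then normalize inf/nan spellings before falling back to toDouble.
  def parseDoubleLiteralSketch(value: String): Double = {
    val trimmed = value.trim
    // Drop one trailing uppercase 'D' only; a lowercase 'f' may be part
    // of "inf", so it is deliberately left alone.
    val body = if (trimmed.endsWith("D")) trimmed.dropRight(1) else trimmed
    body.toLowerCase match {
      case "inf" | "+inf" | "infinity" | "+infinity" => Double.PositiveInfinity
      case "-inf" | "-infinity" => Double.NegativeInfinity
      case "nan" | "+nan" | "-nan" => Double.NaN
      case _ => body.toDouble
    }
  }
}
```

A unit test that feeds suffixed inputs such as `infD` through the converter would exercise the case raised in this comment.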

Development

Successfully merging this pull request may close these issues.

CastOverFlow conversion throws NumberFormatException for inf/-inf in ShimSparkErrorConverter (Spark 3.4/3.5/4.0)

3 participants