Pin numpy version #6953

BLOrange-AMD · 2025-01-15T23:42:57Z

This PR pins the numpy version to fix an incompatibility between the numpy version used by pyt_deepspeed_megatron_gpt2 and pyt_train_deepspeed_megatron_gpt2 and the ROCm PyTorch release/2.5 branch.

loadams · 2025-01-16T00:23:00Z

Hi @BLOrange-AMD - I recall there being a dependency between the torch version and numpy version for us, could you share more info on the error you are seeing?

BLOrange-AMD · 2025-01-16T17:36:18Z

@loadams With ROCm PyTorch 2.5, the pyt_deepspeed_megatron_gpt2 and pyt_train_deepspeed_megatron_gpt2 models download and use the newer numpy-2.2.1 instead of the cached numpy-1.26.4, which causes "RuntimeError: Could not infer dtype of numpy.int64". Pinning the numpy version in DeepSpeed could therefore be a more stable way to solve the issue.
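
For illustration, here is a minimal sketch of a version guard that separates the two releases named above (1.26.4 and 2.2.1). The helper names `parse_release` and `below_numpy_2` are hypothetical and not part of DeepSpeed, and the assumption that the 2.0 boundary is what matters is taken from the versions reported in this thread rather than verified.

```python
def parse_release(version: str) -> tuple:
    """Parse the leading numeric release segment, e.g. '2.2.1' -> (2, 2, 1).

    Hypothetical helper: suffixes such as 'rc1' are dropped from the piece
    they appear in, and parsing stops at the first piece with no leading digits.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def below_numpy_2(version: str) -> bool:
    """Return True when the version string is older than the 2.x series."""
    return parse_release(version) < (2,)


if __name__ == "__main__":
    # The two versions mentioned in the comment above.
    for candidate in ("1.26.4", "2.2.1"):
        print(candidate, below_numpy_2(candidate))
```

In a real environment the version string could come from `importlib.metadata.version("numpy")`; it is passed in as a plain string here so the sketch runs without numpy installed.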

loadams · 2025-01-16T18:54:47Z

@BLOrange-AMD - I see. However, the issue is that DeepSpeed doesn't strictly require a lower numpy version; here is a sample workflow that uses numpy>2.0.0. Is there another way to pin the numpy version? Or do you believe this will be fixed soon in torch 2.6?

Or perhaps torch just needs to be built with numpy support too?
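
One way to check that question in a given environment is a small probe like the sketch below. It assumes the reported failure can be triggered by building a tensor from a `numpy.int64` scalar, which matches the error text quoted above but has not been confirmed against the affected models. The function name `probe_torch_numpy_interop` is hypothetical.

```python
def probe_torch_numpy_interop() -> str:
    """Report whether the installed torch can consume a numpy scalar.

    Returns "unavailable" if torch or numpy cannot be imported,
    "broken: <message>" if the conversion raises RuntimeError,
    and "ok" otherwise.
    """
    try:
        import numpy as np
        import torch
    except ImportError:
        return "unavailable"

    try:
        # Assumed trigger: converting a numpy.int64 scalar to a tensor.
        torch.tensor(np.int64(1))
    except RuntimeError as exc:
        return f"broken: {exc}"
    return "ok"


if __name__ == "__main__":
    print(probe_torch_numpy_interop())
```

Running this once with the cached numpy-1.26.4 and once with numpy-2.2.1 would show whether the torch build itself is the limiting factor, independent of DeepSpeed.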

Updated numpy version

357a271

BLOrange-AMD requested a review from loadams as a code owner January 15, 2025 23:42

loadams changed the title ~~Updated numpy version~~ Pin numpy version Jan 16, 2025
