Empty Intersection in the ChesapeakeCVPR GeoDataset w/ RandomGeoSampler #2406

Open
arjunarao619 opened this issue Nov 12, 2024 · 0 comments
Labels
datasets Geospatial or benchmark datasets

Description

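RandomGeoSampler appears to sample a window from the ChesapeakeCVPR dataset that is out of bounds or empty. In the traceback below, the requested window (col_off=-205, width=201) lies entirely to the left of the raster (col_off=0, width=4901), so rasterio finds no intersection and raises an error. Full stack trace: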

Traceback (most recent call last):
  File "/media/share/share/projects/geolayers/train_baseline.py", line 432, in train
    for i, data in enumerate(testloader,0):
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 701, in __next__
    data = self._next_data()
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1445, in _next_data
    return self._process_data(data)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1491, in _process_data
    data.reraise()
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/_utils.py", line 715, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 4.
Original Traceback (most recent call last):
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/mask.py", line 80, in raster_geometry_mask
    window = geometry_window(dataset, shapes, pad_x=pad_x, pad_y=pad_y)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/features.py", line 477, in geometry_window
    window = window.intersection(raster_window)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 775, in intersection
    return intersection([self, other])
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 125, in wrapper
    return function(*args[0])
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 239, in intersection
    return functools.reduce(_intersection, windows)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 257, in _intersection
    raise WindowError(f"Intersection is empty {w1} {w2}")
rasterio.errors.WindowError: Intersection is empty Window(col_off=-205, row_off=6158, width=201, height=201) Window(col_off=0, row_off=0, width=4901, height=6511)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 351, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torchgeo/datasets/chesapeake.py", line 559, in __getitem__
    data, _ = rasterio.mask.mask(
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/mask.py", line 178, in mask
    shape_mask, transform, window = raster_geometry_mask(
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/mask.py", line 86, in raster_geometry_mask
    raise ValueError('Input shapes do not overlap raster.')
ValueError: Input shapes do not overlap raster.
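
To make the window arithmetic in the WindowError message concrete, here is a minimal sketch of an axis-aligned overlap check. It is plain Python written for illustration, not rasterio's implementation; the `Window` dataclass and the `intersect` helper are hypothetical names that only mirror the fields printed in the error message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Window:
    """Pixel window with the same fields as those printed in the error message."""
    col_off: int
    row_off: int
    width: int
    height: int


def intersect(a: Window, b: Window) -> Optional[Window]:
    """Return the overlapping window, or None when the two windows do not overlap."""
    col_start = max(a.col_off, b.col_off)
    row_start = max(a.row_off, b.row_off)
    col_stop = min(a.col_off + a.width, b.col_off + b.width)
    row_stop = min(a.row_off + a.height, b.row_off + b.height)
    if col_stop <= col_start or row_stop <= row_start:
        return None
    return Window(col_start, row_start, col_stop - col_start, row_stop - row_start)


# Values copied from the WindowError message in the traceback.
requested = Window(col_off=-205, row_off=6158, width=201, height=201)
raster = Window(col_off=0, row_off=0, width=4901, height=6511)

# Columns: requested spans [-205, -4), raster spans [0, 4901), so there is no overlap.
# Rows: requested spans [6158, 6359), raster spans [0, 6511), so the rows do overlap.
print(intersect(requested, raster))
```

The rows of the requested window fall inside the raster, but its columns end 4 pixels before the raster's left edge, which is why the intersection is empty.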

Steps to reproduce

The error is intermittent: it can occur at any iteration of any epoch during training. To reproduce it:

  1. Create a ChesapeakeCVPR dataset:
from torchgeo.datasets import ChesapeakeCVPR

states = ['de', 'md', 'va', 'wv', 'pa', 'ny']
spl_train = [f'{state}-train' for state in states]
spl_val = [f'{state}-val' for state in states]
spl_test = [f'{state}-test' for state in states]

# `modality` (the layers to load) is defined elsewhere in the training script.
trainset = ChesapeakeCVPR(root='/share/chesapeake/cvpr_chesapeake_landcover', download=False, cache=True, layers=modality, splits=spl_train, transforms=None)
  2. Initialize a RandomGeoSampler and a dataloader:
import torch
from torchgeo.datasets import stack_samples
from torchgeo.samplers import RandomGeoSampler, Units

# `generator`, `BATCH_SIZE`, and `cfg` are defined elsewhere in the training script.
trainsampler = RandomGeoSampler(trainset, size=256, units=Units.PIXELS, generator=generator)
trainloader = torch.utils.data.DataLoader(trainset, sampler=trainsampler, batch_size=BATCH_SIZE, num_workers=cfg['num_workers'], drop_last=False, generator=generator, collate_fn=stack_samples)
  3. Iterate through the dataloader. At some point, an iteration should raise the exception shown in the traceback.

Version

0.7.0.dev0

@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Nov 12, 2024