Increasing GPU memory usage caused by the iterative use of FBP · Issue #125 · LLNL/LEAP

Open
TonyLi-Shu opened this issue Oct 27, 2024 · 4 comments

TonyLi-Shu commented Oct 27, 2024

Hi Kyle Champley,

Thanks a lot for this amazing toolkit. It is very convenient to use with PyTorch. However, I am running into an issue where my GPU memory keeps increasing when I iteratively run FBP on different projections (via either leaptorch or leapctypes).

Background:

I want to train a neural network to identify the noise in 2D CT FBP images, working from the projection domain. My input is therefore a batch of projections with shape (batch_size, Num_Projections, Num_rows, Num_cols), and I need to repeatedly run filtered backprojection (FBP) to obtain the 2D FBP images.

Current approach:

I created a Projector using leaptorch and then used proj.fbp to run FBP on every batch of projections. I observed that GPU memory usage grows as the number of FBP calls increases, and eventually the GPU memory fills up completely.

Troubleshooting:

  1. I double-checked the rest of my PyTorch code by commenting out the FBP operations. In that case the GPU memory stays constant (does not increase) during training of the neural network, which suggests the growing GPU memory is caused by FBP, or possibly by my incorrect use of FBP.
  2. I tried fbp in leaptorch, FBP_gpu in leapctypes with inplace=True, and even FBP_gpu in libprojectors. All three functions make GPU memory grow when I run FBP iteratively on each batch.
  3. I also reduced my batch size to 12, 8, or 6. GPU memory still grows as long as I run FBP iteratively.
  4. I tried torch.cuda.empty_cache() after deleting the variables, as well as gc.collect(), and I also tried creating a new projector for every iteration or epoch (see the cleanup sketch after this list). Unfortunately, none of this works; the memory still increases.
  5. My PyTorch version is 2.4.1+cu118 and my LEAP version is 1.23 (I upgraded to 1.23 on 2024/10/24, which I believe is the newest release).
I therefore dug further into the CUDA code, and I see many GPU memory transfer operations (Memcpy and Memcpy3D). This makes me worry that there might be a conflict between the PyTorch training and FBP, or that FBP leaves some GPU memory unfreed which accumulates batch after batch.

Are my code and settings correct for FBP? If they are, is there a way to free the GPU memory after FBP so that it does not accumulate over iterations?

Details about the projector:

The current settings are Num_Projections=720, Num_rows=1, Num_cols=1024, and batch_size=16.
[screenshot: projector configuration printout]

Details about the FBP code:

# self.tempo_A is the cone-beam Projector from leaptorch
def tempo_A_FBP(self, y):
    # Pre-allocate the reconstruction volume for the whole batch on the GPU
    result_x = torch.zeros((y.size(0), 1, self.image_size, self.image_size), requires_grad=False).contiguous().to(self.device)
    y = y.contiguous()
    for i_ in range(y.size(0)):
        with torch.no_grad():
            if self.tempo_A.leapct.verify_inputs(y[i_, :, :, :], result_x[i_, :, :, :]):
                # result_x[i_, :, :, :] = self.tempo_A.fbp(y[i_, :, :, :])
                self.tempo_A.leapct.FBP_gpu(y[i_, :, :, :], result_x[i_, :, :, :], inplace=True)
                # Also tried calling libprojectors directly through ctypes:
                # self.tempo_A.leapct.libprojectors.FBP_gpu.restype = ctypes.c_bool
                # self.tempo_A.leapct.libprojectors.FBP_gpu.argtypes = [ctypes.c_void_p, ctypes.c_void_p]
                # self.tempo_A.leapct.set_model()
                # self.tempo_A.leapct.libprojectors.FBP_gpu(y[i_, :, :, :].data_ptr(), result_x[i_, :, :, :].data_ptr())
            else:
                raise Exception("Error in FBP!")

    return result_x
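
To help localize whether the growth lives inside PyTorch's caching allocator or outside it (for example in memory allocated directly by the CUDA library), here is a sketch of per-batch logging around the FBP calls; `loader` and `model` are placeholders for my data pipeline and the class that owns tempo_A_FBP:

import torch

device = torch.device("cuda")

def log_gpu_memory(tag):
    # Memory tracked by PyTorch's caching allocator vs. free memory on the whole device.
    # Growth that shows up only in the device-level numbers points at allocations made
    # outside PyTorch (e.g. cudaMalloc calls inside a C++/CUDA library).
    allocated = torch.cuda.memory_allocated(device) / 2**20  # MiB held by tensors
    reserved = torch.cuda.memory_reserved(device) / 2**20    # MiB cached by PyTorch
    free, total = torch.cuda.mem_get_info(device)            # bytes, whole device
    print(f"{tag}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB, "
          f"device free={free / 2**20:.1f} / {total / 2**20:.1f} MiB")

for batch_idx, projections in enumerate(loader):
    log_gpu_memory(f"before FBP, batch {batch_idx}")
    recon = model.tempo_A_FBP(projections.to(device))
    log_gpu_memory(f"after FBP, batch {batch_idx}")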

Error when the GPU memory is full

Here is the error I get when the GPU memory fills up while using fbp in leaptorch:
[screenshot: out-of-memory error from fbp in leaptorch]
And here is another one when I use FBP_gpu in leapctypes:
[screenshot: out-of-memory error from FBP_gpu in leapctypes]

Let me know if there is anything else I should provide to make this issue clearer. Looking forward to hearing your thoughts. Thanks a lot in advance.

kylechampley (Collaborator) commented

Thanks for reporting this issue.

Yes, LEAP does need to allocate temporary memory to perform its operations. I am pretty careful about freeing that memory when it is no longer needed, but I may have missed something. I'll run some tests and get back to you.

kylechampley (Collaborator) commented

FYI, if your cone-beam data only has one detector row, I recommend using a fan-beam geometry because the cone-beam geometry models the divergence of the rays in the z-direction and may clip off some of your volume.
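
For readers following along, switching to fan-beam would look roughly like the sketch below. All values are placeholders, and the exact set_fanbeam argument list (the order here follows the pattern of the LEAP demo scripts) should be verified against the current LEAP documentation:

from leapctype import tomographicModels

leapct = tomographicModels()

# Placeholder geometry matching the sizes in this issue (720 views, 1 row, 1024 columns).
numAngles, numRows, numCols = 720, 1, 1024
pixelSize = 1.0            # assumed detector pixel size (mm)
sod, sdd = 1100.0, 1400.0  # assumed source-to-object / source-to-detector distances (mm)

# Assumed argument order: numAngles, numRows, numCols, pixelHeight, pixelWidth,
# centerRow, centerCol, phis, sod, sdd
leapct.set_fanbeam(numAngles, numRows, numCols,
                   pixelSize, pixelSize,
                   0.5 * (numRows - 1), 0.5 * (numCols - 1),
                   leapct.setAngleArray(numAngles, 360.0),
                   sod, sdd)
leapct.set_default_volume()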

TonyLi-Shu (Author) commented

Thanks a lot for the fast reply and the great suggestions, Kyle! Looking forward to your findings.

kylechampley (Collaborator) commented

I've done a lot of stress testing of LEAP trying to find memory leaks, but I cannot find any.

Have you tried the newest version of LEAP, which was released a few days ago? It uses less memory and may help resolve your issue.

Also note that for some algorithms to work, LEAP must make temporary copies of the volume and/or projection data. These copies are freed when the algorithm completes, but they may push you beyond the available GPU memory because PyTorch typically uses a TON of memory. Although it is possible there is a memory leak in LEAP, if the memory issue persists with the latest version of LEAP, at this point I think it has to do with PyTorch.
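
A quick way to check this on your side is to compare what PyTorch's allocator reports against what is actually allocated to tensors (a sketch using standard PyTorch calls; nothing here is LEAP-specific):

import torch

device = torch.device("cuda")

# memory_summary() breaks down what PyTorch's caching allocator is holding,
# including blocks that are reserved but not currently backing any tensor.
print(torch.cuda.memory_summary(device))

# If "reserved" is much larger than "allocated", PyTorch is caching freed blocks.
# empty_cache() returns those cached blocks to the driver so that other libraries
# (such as LEAP's CUDA code) can use them.
torch.cuda.empty_cache()
print(torch.cuda.memory_allocated(device), torch.cuda.memory_reserved(device))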
