tf.image.resize_images() - weird padding behaviour? #6720
Comments
@xmbrst could this be documented better? |
I think we'd need to understand better what's going wrong before knowing how to fix this. In what way is it weird? |
The corner alignment mechanism is different between tf.image.resize* and skimage.transform.rescale, no matter align_corners=True or False. |
Yes, but how are they different? |
In skimage, the "area" of a pixel is taken into account. In tf.image it feels like a pixel is considered a "point" without area. This leads to a difference in alignment. E.g. when up-scaling a 2x2 image to a 4x4 image, the alignment is: Would love to hear comments about which alignment is better for NN training. |
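To make the two conventions concrete, here is a small sketch (my own illustration, not from the original comment) of where each of the 4 output rows samples the 2-row input in the 2x2 -> 4x4 case. The "point" convention is what align_corners=True uses; the half-pixel ("area") convention is what skimage/OpenCV use:

```python
import numpy as np

H_in, H_out = 2, 4

# "Point" convention (align_corners=True): the centers of the corner pixels
# coincide, so output index i samples input coordinate i * (H_in-1) / (H_out-1).
point_coords = np.arange(H_out) * (H_in - 1) / (H_out - 1)

# Half-pixel / "area" convention (skimage, OpenCV): output pixel centers are
# mapped back via (i + 0.5) * H_in / H_out - 0.5.
area_coords = (np.arange(H_out) + 0.5) * H_in / H_out - 0.5

print(point_coords)  # approximately 0, 1/3, 2/3, 1
print(area_coords)   # -0.25, 0.25, 0.75, 1.25
```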
Have you considered using the AREA resize method? |
The AREA method seems to only affect interpolation but not alignment. At least I get strange results with it:

```python
import tensorflow as tf
import numpy as np
from skimage.transform import rescale

arr = np.array(
    [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]], dtype='float32')

input = tf.constant(arr)
input4D = tf.reshape(input, [1, 3, 4, 1])
resize = tf.image.resize_area(input4D, [6, 8], align_corners=True)[0, :, :, 0]

sess = tf.Session()
r1 = sess.run(resize)
r2 = rescale(arr / 100.0, 2, mode='edge') * 100
```

gives

|
What does that produce? We have a lot of bugs to triage, so it's helpful if people include output along with code. |
Ugh, you're right, that's pretty weird. @martinwicke Our |
There are two separate issues here: For (1), @ppwwyyxx's comment is exactly right. Unfortunately, there is a bug in the implementation of the nearest neighbor and area interpolation methods when align_corners=True. For nearest neighbor interpolation this has already been fixed internally and will be pushed to GitHub in the next couple of days. We will fix a similar bug for area interpolation very soon. |
@ppwwyyxx, Regarding your question on "which alignment is better for NN training", multiple approaches are possible as long as you are consistent. Here is my own favorite set of rules that we have followed in our DeepLab semantic image segmentation system: "DeepLab's Four Alignment Rules": |
Thanks @gpapan ! For (4), is it to ensure equal padding on both sides of the image? UPDATE: this is to make sure that stride-2 convolution is always applied on odd-size images, so there will be equal padding on both sides. |
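The padding point can be seen from the standard 'SAME' padding computation: for a 3x3 stride-2 convolution, an odd-sized input gets symmetric padding while an even-sized input gets the extra pixel on one side only. A small sketch of that computation (my own illustration):

```python
import math

def same_padding_1d(in_size, kernel=3, stride=2):
    # TF-style 'SAME' padding: output size is ceil(in / stride), and the
    # total padding is whatever is needed to make that work out.
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + kernel - in_size, 0)
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before
    return pad_before, pad_after

print(same_padding_1d(65))  # (1, 1): odd input, symmetric padding
print(same_padding_1d(64))  # (0, 1): even input, extra padding on one side
```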
There are several issues with resize_images. It would be good to have a known-good implementation of this, even if we have to hide it behind a flag ( |
Preferably, the new implementation should follow what is used elsewhere (OpenCV, SciPy, Matlab, ...), which is to align the very corners of the top-left pixel (-0.5, -0.5) and the bottom-right pixel (height - 0.5, width - 0.5) and resample using the corresponding pixel centers. |
Many thanks to the TF developers/contributors for their support. I believe this is a critical part of image processing pipelines; it would be great to have this fixed soon, please. |
@mohapatras antialias is only available in the 2.0 versions, which you can use via `import tensorflow.compat.v2 as tf_v2`. |
Can you provide examples of inputs and outputs? |
Hi John, sorry, I didn't check my code carefully. I've deleted the comment since there was a bug in my code. My results show that the outputs of tf.image.resize and cv2.resize are the same when using bilinear interpolation. However, the speed difference is obvious; this probably comes from TensorFlow's GPU acceleration. Thank you for your time. |
@johnpjf @martinwicke Here is my test code:

```python
import tensorflow as tf
import tensorflow.compat.v2 as tf_v2
import numpy as np
import cv2

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

resize_shape = (10, 10)
a = np.ones((1, 2, 2, 1), dtype=np.float32)
a[0, 0, 0, 0] = 5.0
a[0, 1, 1, 0] = 5.0
b = tf.constant(a, dtype=tf.float32)

# c = tf.image.resize_bilinear(b, resize_shape)
c = tf_v2.image.resize(b, resize_shape,
                       method='bilinear',
                       antialias=True)
d = tf_v2.image.resize(c, (5, 5),
                       method='bilinear',
                       antialias=True)

with tf.Session() as sess:
    np_c = sess.run(c)
    np_d = sess.run(d)

temp = cv2.resize(a[0], resize_shape, interpolation=cv2.INTER_LINEAR)
temp2 = cv2.resize(np_c[0, :, :, 0], (5, 5), interpolation=cv2.INTER_LINEAR)

print("Tensorflow (upsampled):")
print(np_c[0, :, :, 0])
print("OpenCV (upsampled):")
print(temp)
print("Tensorflow (downsampled):")
print(np_d[0, :, :, 0])
print("OpenCV (downsampled):")
print(temp2)
print("Difference (upsampled):")
print(np_c[0, :, :, 0] - temp)
print("Difference (downsampled):")
print(np_d[0, :, :, 0] - temp2)
```

Here is my output with tensorflow 1.14.0:
If I change bilinear to bicubic, there are even more inconsistent values in outputs:
Am I missing something? |
For bilinear you are using antialias=True in TensorFlow, which enlarges the kernel when downsampling in order to antialias; this is why your downsampled result is different in TF. |
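As a rough illustration of that point (my own sketch, assuming TF 2.x eager execution): downsampling a single bright pixel with and without antialias shows the enlarged kernel support.

```python
import numpy as np
import tensorflow as tf  # assumes TF 2.x (eager mode)

x = np.zeros((1, 8, 8, 1), dtype=np.float32)
x[0, 3, 3, 0] = 1.0  # a single bright pixel

no_aa = tf.image.resize(x, (4, 4), method='bilinear', antialias=False)
with_aa = tf.image.resize(x, (4, 4), method='bilinear', antialias=True)

# With antialias=True the triangle kernel is widened by the downscale factor,
# so the bright pixel contributes to more output pixels.
print(np.count_nonzero(no_aa.numpy()))
print(np.count_nonzero(with_aa.numpy()))
```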
It seems bilinear upsampling can also be done with a transposed convolution: Also, here is a great explanation of resize in TensorFlow: Some more related Core ML material: |
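For reference, a minimal sketch of the transposed-convolution approach (my own illustration, assuming TF 2.x; the FCN-style bilinear kernel, an integer upsampling factor, and the input shape are all just example choices):

```python
import numpy as np
import tensorflow as tf  # assumes TF 2.x (eager mode)

def bilinear_upsample_kernel(factor, channels):
    # FCN-style bilinear kernel for integer upsampling factors:
    # one separable triangle filter per channel, no cross-channel mixing.
    size = 2 * factor - factor % 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    kernel = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
    weights = np.zeros((size, size, channels, channels), dtype=np.float32)
    for c in range(channels):
        weights[:, :, c, c] = kernel
    return weights

factor, channels = 2, 1
x = tf.constant(np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1))
w = tf.constant(bilinear_upsample_kernel(factor, channels))
y = tf.nn.conv2d_transpose(
    x, w,
    output_shape=[1, 4 * factor, 4 * factor, channels],
    strides=[1, factor, factor, 1],
    padding='SAME')
print(y.numpy()[0, :, :, 0])
```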
Hello, could anybody please tell me if it's totally safe to use tf_v2.image.resize? Thank you very much in advance! |
Yes, tf_v2.image.resize works correctly. |
@johnpjf Perfect. Thanks a lot! |
Hi, |
@MingxiLi
since the defaults for resize are:
|
Also, UpSampling2D in tf.keras sets align_corners=False by default. Even worse, it doesn't allow you to modify it from the high-level API. |
@brucechou1983 |
@brucechou1983 you can use the TensorFlow resize with align_corners=True and wrap it inside a Lambda layer in Keras or tf.keras. |
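For example, a minimal sketch of that wrapping (the input and target sizes here are just assumptions for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

def resize_align_corners(x, size=(64, 64)):
    # Use the v1 op, which still exposes align_corners=True.
    return tf.compat.v1.image.resize_bilinear(x, size, align_corners=True)

inputs = tf.keras.Input(shape=(32, 32, 3))
upsampled = layers.Lambda(resize_align_corners,
                          arguments={'size': (64, 64)})(inputs)
model = tf.keras.Model(inputs, upsampled)
model.summary()
```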
I know I am in the minority here, but I have an application where I actually want the resizing to align the corners like it does in tf.compat.v1.image.resize_bilinear(..., align_corners=True). I know I can access the old code by using tf.compat.v1.image.resize_bilinear(), but I am concerned that that API compatibility will eventually go away, and I cannot figure out how to duplicate the old functionality using tf.image.resize() in version 2.0 or higher. Can someone help me figure out how to use the new tf.image.resize() function and duplicate the processing in the old tf.compat.v1.image.resize_bilinear() function with align_corners=True? Here is my code:

```python
!conda list tensor*

import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
import numpy as np
import cv2

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

resize_shape = (9, 9)
a = np.ones((1, 2, 2, 1), dtype=np.float32)
a[0, 0, 0, 0] = 5.0
a[0, 1, 1, 0] = 5.0
b = tf.constant(a, dtype=tf.float32)

c1 = tf_v1.image.resize_bilinear(b, resize_shape, align_corners=True)
c2 = tf.image.resize(b, resize_shape, method='bilinear', antialias=False)
c1 = c1.numpy()
c2 = c2.numpy()

print("\nInput")
print(a[0, :, :, 0])
print("Tensorflow 1:")
print(c1[0, :, :, 0])
print("Tensorflow 2:")
print(c2[0, :, :, 0])
```

And here is the output. I want to get the output from "Tensorflow 1" using tf.image.resize() in TF 2.0:

|
@dcouwenh There are valid use cases that align the centers of the corner pixels. I have a change that exposes a "ScaleAndTranslateOp" that underlies the new resize, so you can use that when we merge it in (soon). Until then you can use the v1 version, or you can access the low-level scale-and-translate op as a workaround: |
That op should also be available in `tf.raw_ops`.
|
@johnpjf Thanks for your reply. I edited my code to call the scale_and_translate op directly as you suggested, but I am not able to reproduce the results from the v1 resize_bilinear op. I am not trying to align the centers of the corner pixels; I am trying to align the corners of the corner pixels (like the v1 code did with align_corners=True). I tried to play with the scale and translate parameters, but no matter what I tried I get blocks of pixels in the corners that have a constant value. In my application, I have a vector field that I want to vary smoothly over the entire area of the output image (like the Tensorflow 1 output shows), and not have the constant-value areas in the corners shown in the Tensorflow 2 and scale_and_translate outputs. Here is my edited code:

```python
!conda list tensor*

import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
import numpy as np
from tensorflow.python.ops import gen_image_ops

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

resize_shape = (9, 9)
a = np.ones((1, 2, 2, 1), dtype=np.float32)
a[0, 0, 0, 0] = 5.0
a[0, 1, 1, 0] = 5.0
b = tf.constant(a, dtype=tf.float32)

c1 = tf_v1.image.resize_bilinear(b, resize_shape, align_corners=True)
c2 = tf.image.resize(b, resize_shape, method='bilinear', antialias=False)

scale = [9 / 2, 9 / 2]
translate = [0, 0]
c3 = gen_image_ops.scale_and_translate(
    b,
    resize_shape,
    scale,
    translate,
    kernel_type='triangle',
    antialias=False)

c1 = c1.numpy()
c2 = c2.numpy()
c3 = c3.numpy()

print("\nInput")
print(a[0, :, :, 0])
print("Tensorflow 1:")
print(c1[0, :, :, 0])
print("Tensorflow 2:")
print(c2[0, :, :, 0])
print("scale_and_translate:")
print(c3[0, :, :, 0])
```

And here is the updated output. The direct call to scale_and_translate still has the duplicate values in the corners; I am trying to get the Tensorflow 1 output, which has linearly varying values coming away from the corners.

|
You can figure out what the params should be by writing the equations out. In pixel-center coordinates the op maps out = scale * in + translate, so for your example you want the pixel at position 0.5 to stay at 0.5, and the one at 1.5 to transform to 8.5:

0.5 * scale + translate = 0.5
1.5 * scale + translate = 8.5

Solving gives:

scale = 8, translate = -3.5

so:

scale = [8.0, 8.0]
translate = [-3.5, -3.5]

Matches the tf1 behavior. |
@johnpjf Excellent, thank you! This is exactly what I was looking for. For completeness, I have edited my code to include your solution for arbitrary input and output shapes.

```python
!conda list tensor*

import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
import numpy as np
from tensorflow.python.ops import gen_image_ops

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

Hi, Wi = (3, 3)
Ho, Wo = (10, 9)
a = np.arange(Hi * Wi).reshape(1, Hi, Wi, 1)
b = tf.constant(a, dtype=tf.float32)

c1 = tf_v1.image.resize_bilinear(b, (Ho, Wo), align_corners=True)
c2 = tf.image.resize(b, (Ho, Wo), method='bilinear', antialias=False)

sy = (Ho - 1) / (Hi - 1)
sx = (Wo - 1) / (Wi - 1)
scale = [sy, sx]
ty = (1 - sy) / 2
tx = (1 - sx) / 2
translate = [ty, tx]
c3 = gen_image_ops.scale_and_translate(
    b,
    (Ho, Wo),
    scale,
    translate,
    kernel_type='triangle',
    antialias=False)

c1 = c1.numpy()
c2 = c2.numpy()
c3 = c3.numpy()

print("\nInput")
print(a[0, :, :, 0])
print("Tensorflow 1:")
print(c1[0, :, :, 0])
print("Tensorflow 2:")
print(c2[0, :, :, 0])
print("scale_and_translate:")
print(c3[0, :, :, 0])
```

And here is the output:

|
Use TF V2 resize for Grad-CAM, because TF V1 bilinear resize has an artifact of shifting images. tensorflow/tensorflow#6720 https://hackernoon.com/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35
tf.image.resize_images() seems to use a strange padding option; which one is not clear to me at the moment. I tried to replicate the bilinear interpolation with various padding options in, for example, skimage, but I can't replicate the behaviour.
It would be nice to be able to set the padding option used in tf.image.resize_images(), or at least document what is used.
Example code for comparing the results of tf.image.resize_images() and a skimage transform:
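The original snippet was not preserved in this copy; here is a minimal sketch of such a comparison (written against the TF 1.x-era API; the input values, the target size, and the skimage mode='edge' choice are my own assumptions):

```python
import numpy as np
import tensorflow as tf
from skimage.transform import resize

arr = np.arange(12, dtype='float32').reshape(3, 4)
img = tf.reshape(tf.constant(arr), [1, 3, 4, 1])

# TF 1.x-era API; in current TF this lives under tf.compat.v1.
resized = tf.image.resize_images(img, [6, 8],
                                 method=tf.image.ResizeMethod.BILINEAR)
with tf.Session() as sess:
    tf_out = sess.run(resized)[0, :, :, 0]

# skimage maps pixel centers (the "area" convention); mode='edge' replicates
# the border, which is one plausible padding choice to compare against.
sk_out = resize(arr, (6, 8), order=1, mode='edge')

print(tf_out)
print(sk_out)
print(tf_out - sk_out)
```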
Looks like tf.image.resize_images() does some weird asymmetrical padding!?
Using tensorflow 0.12.1: