tf.image.resize_images() - weird padding behaviour? · Issue #6720 · tensorflow/tensorflow · GitHub

Closed
JoelKronander opened this issue Jan 8, 2017 · 60 comments
Assignees
Labels
stat:contribution welcome Status - Contributions welcome type:bug Bug

Comments

@JoelKronander

tf.image.resize_images() seems to use a strange padding option, and it is not clear to me which one. I tried to replicate the bilinear interpolation with various padding options in, for example, skimage, but I can't replicate the behaviour.

It would be nice to be able to set the padding option used in tf.image.resize_images(), or at least document which one is used.

Example code comparing the results of tf.image.resize_images() and skimage transform:
It looks like tf.image.resize_images() does some weird asymmetric padding!?
Using tensorflow 0.12.1:

import tensorflow as tf
import numpy as np
import skimage.transform  # note: `import skimage` alone does not load the transform submodule
from scipy.misc import imsave

sess = tf.InteractiveSession()

#create simple test image
imsize = 3
xa, ya = np.ogrid[:imsize, :imsize]
img = np.repeat((xa + ya)[..., np.newaxis], 3, 2) / float(imsize + imsize)

x = tf.placeholder(tf.float32, [1, imsize, imsize, 3])
y = tf.image.resize_images(x,(imsize*3, imsize*3))

sess.run(tf.global_variables_initializer())

upsampled_tf_result = sess.run(y, feed_dict={x: [img]})
upsampled_skimage_result = skimage.transform.rescale(img,
                                     3,
                                     mode='symmetric',
                                     cval=0,
                                     order=1,
                                     preserve_range=False)

print(np.allclose(upsampled_tf_result, upsampled_skimage_result))

imsave('upsampled_tf_result.png', np.squeeze(upsampled_tf_result))
imsave('upsampled_skimage_result.png', upsampled_skimage_result)
@michaelisard michaelisard added the type:docs-bug Document issues label Jan 9, 2017
@michaelisard

@xmbrst could this be documented better?

@xmbrst xmbrst assigned dr4b and unassigned xmbrst Jan 9, 2017
@girving
Contributor
girving commented Apr 28, 2017

I think we'd need to understand better what's going wrong before knowing how to fix this. In what way is it weird?

@girving girving added the stat:awaiting response Status - Awaiting response from author label Apr 28, 2017
@ppwwyyxx
Contributor

The corner alignment mechanism differs between tf.image.resize* and skimage.transform.rescale, regardless of whether align_corners is True or False.

@girving
Contributor
girving commented Apr 28, 2017

Yes, but how are they different?

@ppwwyyxx
Contributor
ppwwyyxx commented Apr 28, 2017

In skimage, the "area" of a pixel is taken into account. In tf.image it feels like a pixel is considered as a "point" without area. This leads to a difference in alignment.

E.g. when up-scaling a 2x2 image to a 4x4 image, the alignment is:
skimage: (0, 0) -> (0.5, 0.5), (1, 1) -> (2.5, 2.5)
tf.image.resize_*(align_corners=True): (0, 0) -> (0, 0), (1, 1) -> (3, 3)
tf.image.resize_*(align_corners=False): (0, 0) -> (0, 0), (1, 1) -> (2, 2)

I'd love to hear comments about which alignment is better for NN training.
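The three conventions described above can be written out explicitly. Here is a minimal sketch (not TF source; the function names are mine) that maps a source pixel index to the output coordinate it lands on, for one axis:

```python
# Sketch of the three alignment conventions, for one axis of size
# `in_size` resized to `out_size`.

def dst_half_pixel(src, in_size, out_size):
    # skimage/OpenCV style: a pixel is a 1x1 area, centers at i + 0.5
    return (src + 0.5) * out_size / in_size - 0.5

def dst_align_corners_true(src, in_size, out_size):
    # corner pixel centers map exactly onto each other
    return src * (out_size - 1) / (in_size - 1)

def dst_align_corners_false(src, in_size, out_size):
    # legacy TF default: plain index scaling, no half-pixel offset
    return src * out_size / in_size

# 2x2 -> 4x4, matching the numbers quoted above:
print(dst_half_pixel(1, 2, 4))           # 2.5  (skimage: (1,1) -> (2.5,2.5))
print(dst_align_corners_true(1, 2, 4))   # 3.0  (align_corners=True)
print(dst_align_corners_false(1, 2, 4))  # 2.0  (align_corners=False)
```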

@girving
Contributor
girving commented Apr 28, 2017

Have you considered using method=ResizeMethod.AREA if you want area based resizing?

@ppwwyyxx
Contributor
ppwwyyxx commented Apr 28, 2017

The AREA method seems to only affect interpolation but not alignment. At least I get strange results with it:

import tensorflow as tf
import numpy as np
from skimage.transform import rescale

arr = np.array(
    [[1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]], dtype='float32')

input = tf.constant(arr)
input4D = tf.reshape(input, [1, 3, 4, 1])
resize = tf.image.resize_area(input4D, [6, 8], align_corners=True)[0,:,:,0]
sess = tf.Session()
r1 = sess.run(resize)
r2 = rescale(arr/100.0, 2, mode='edge') * 100

gives

r1=array([[  1.        ,   1.        ,   1.66666651,   2.        ,   2.33333278,   3.        ,   3.        ,   4.00000048],
       [  1.        ,   1.        ,   1.66666651,   2.        ,   2.33333278,   3.        ,   3.        ,   4.00000048],
       [  3.00000024,   3.00000024,   3.66666651,   4.00000048,   4.33333302,   5.00000048,   5.00000095,   6.00000048],
       [  5.        ,   5.        ,   5.66666603,   6.        ,   6.33333206,   7.        ,   7.00000048,   8.00000095],
       [  4.99999952,   4.99999952,   5.66666555,   5.99999952,   6.33333111,   6.99999905,   7.        ,   8.        ],
       [  9.00000191,   9.00000191,   9.66666698,  10.00000191,  10.33333397,  11.00000191,  11.00000191,  12.00000286]], dtype=float32)
r2=array([[  0.99999998,   1.24999997,   1.74999996,   2.24999995,   2.74999994,   3.24999993,   3.74999992,   3.99999991],
       [  2.        ,   2.24999998,   2.74999995,   3.24999994,   3.74999995,   4.24999994,   4.74999991,   4.99999989],
       [  4.00000005,   4.25000001,   4.74999993,   5.24999992,   5.74999998,   6.24999997,   6.74999988,   6.99999984],
       [  6.00000015,   6.25000009,   6.74999999,   7.24999995,   7.74999999,   8.24999996,   8.74999985,   8.9999998 ],
       [  8.00000029,   8.25000023,   8.75000013,   9.25000005,   9.74999999,  10.24999991,  10.74999981,  10.99999975],
       [  9.00000036,   9.25000031,   9.7500002 ,  10.2500001 ,  10.74999999,  11.24999989,  11.74999978,  11.99999973]])

@girving
Contributor
girving commented Apr 28, 2017

What does that produce? We have a lot of bugs to triage, so it's helpful if people include output along with code.

@girving
Contributor
girving commented Apr 28, 2017

Ug, you're right, that's pretty weird. @martinwicke Our tf.image.resize_area function isn't even reflection equivariant. It would be lovely to fix this, but I'd be worried about breaking old models.

@girving girving added stat:awaiting tensorflower Status - Awaiting response from tensorflower type:bug Bug and removed stat:awaiting response Status - Awaiting response from author type:docs-bug Document issues labels Apr 28, 2017
@gpapan
gpapan commented Apr 29, 2017

There are two separate issues here:
(1) Alignment of the tensor values at the input and output of the resize function.
(2) Interpolation method.

For (1), @ppwwyyxx's comment is exactly right:
When using align_corners=True, we consider the image value as a point sample of a continuous function at the pixel center. When using align_corners=False, we consider the image value as the average of a continuous function over a 1x1 pixel square centered at the pixel center.

Unfortunately, there is a bug in the implementation of nearest neighbor and area interpolation methods when align_corners=True. For nearest neighbor interpolation this has already been fixed internally and will be pushed to github in the next couple of days. We will fix a similar bug for area interpolation very soon.

@gpapan
gpapan commented Apr 29, 2017

@ppwwyyxx, Regarding your question on "which alignment is better for NN training", multiple approaches are possible as long as you are consistent. Here is my own favorite set of rules that we have followed in our DeepLab semantic image segmentation system:

"DeepLab's Four Alignment Rules":
(1) Use of odd-sized kernels in all convolution and pooling ops.
(2) Use of SAME boundary conditions in all convolution and pooling ops.
(3) Use align_corners=True when upsampling feature maps with bilinear interpolation.
(4) Use of inputs with height/width equal to a multiple of the output_stride, plus one (for example, when the CNN output stride is 8, use height or width equal to 8 * n + 1, for some n, e.g., image HxW set to 321x513).
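As a rough illustration of rule (4) (my own sketch, not DeepLab code): sizes of the form output_stride * n + 1 stay odd through every stride-2 stage, so SAME padding remains symmetric at each step.

```python
import math

def deeplab_size(n, output_stride=8):
    # Rule (4): height/width = output_stride * n + 1
    return output_stride * n + 1

def stride2_extent(size):
    # SAME padding with stride 2: output extent is ceil(size / 2)
    return math.ceil(size / 2)

s = deeplab_size(40)        # 321, as in the 321x513 example above
sizes = [s]
for _ in range(3):          # three stride-2 stages: 321 -> 161 -> 81 -> 41
    s = stride2_extent(s)
    sizes.append(s)
print(sizes)                # [321, 161, 81, 41] - every extent stays odd
```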

@ppwwyyxx
Contributor
ppwwyyxx commented Apr 30, 2017

Thanks @gpapan! For (4), is it to ensure equal padding on both sides of the image?

UPDATE: this is to make sure that stride-2 convolution is always applied on odd-sized images, so there will be equal padding on both sides.

@martinwicke
Member

There are several issues with resize_images. It would be good to have a known-good implementation of this, even if we have to hide it behind a flag (correct=False).

@martinwicke martinwicke added stat:contribution welcome Status - Contributions welcome and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Jan 22, 2018
@tpet
tpet commented Apr 13, 2018

Preferably, the new implementation should follow what is used elsewhere (OpenCV, SciPy, Matlab, ...), which is to align the outer corners of the top-left pixel (-0.5, -0.5) and the bottom-right pixel (height - 0.5, width - 0.5), and resample using the corresponding pixel centers.
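A minimal 1-D sketch of that convention (my own illustration, not any library's source): the outer edges of input and output line up, each output pixel center is mapped back through the half-pixel formula, and sampling is clamped at the borders.

```python
import numpy as np

def resize_linear_1d(x, out_size):
    # Half-pixel convention: output index dst samples source coordinate
    # (dst + 0.5) * in/out - 0.5, clamped at the edges, with linear
    # interpolation between the two nearest pixel centers.
    in_size = len(x)
    dst = np.arange(out_size)
    src = (dst + 0.5) * in_size / out_size - 0.5
    src = np.clip(src, 0, in_size - 1)
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, in_size - 1)
    w = src - lo
    return (1 - w) * x[lo] + w * x[hi]

print(resize_linear_1d(np.array([0.0, 1.0]), 4))  # -> [0., 0.25, 0.75, 1.]
```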

@rsethur
rsethur commented May 18, 2018

Many thanks to the TF developers/contributors for their support. I believe this is a critical part of image processing pipelines, so it would be great to have this fixed soon.
Detailed writeup by Oleksandr Savsunenko:
https://hackernoon.com/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35

@johnpjf
johnpjf commented Jun 27, 2019

@mohapatras antialias is only available in the 2.0 API, which you can use via:

import tensorflow.compat.v2 as tf_v2
...
tf_v2.image.resize(..., antialias=True, ...)

@johnpjf
johnpjf commented Jul 12, 2019

Can you provide examples of inputs and outputs?

@biendltb
biendltb commented Jul 12, 2019

Can you provide examples of inputs and outputs?

Hi John, sorry, I didn't check my code carefully. I've deleted the comment since there was a bug in my code. My results show that tf.image.resize and cv2.resize produce the same output when using bilinear interpolation. However, the speed difference is obvious; this is likely because of TensorFlow's GPU performance. Thank you for your time.

@protossw512
protossw512 commented Aug 13, 2019

@johnpjf @martinwicke
Thank you for the fix. However, I found that the new implementation still does not match OpenCV under bilinear interpolation; is that the desired behavior?

Here is my test code:

import tensorflow as tf
import tensorflow.compat.v2 as tf_v2
import numpy as np
import cv2
np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)
resize_shape = (10, 10)

a = np.ones((1, 2, 2, 1), dtype=np.float32)
a[0, 0, 0, 0] = 5.0
a[0, 1, 1, 0] = 5.0

b = tf.constant(a, dtype=tf.float32)
# c = tf.image.resize_bilinear(b, resize_shape)
c = tf_v2.image.resize(b, resize_shape,
                       method='bilinear',
                       antialias=True)
d = tf_v2.image.resize(c, (5, 5),
                       method='bilinear',
                       antialias=True)

with tf.Session() as sess:
    np_c = sess.run(c)
    np_d = sess.run(d)

temp = cv2.resize(a[0], resize_shape, interpolation=cv2.INTER_LINEAR)
temp2 = cv2.resize(np_c[0, :, :, 0], (5,5), interpolation=cv2.INTER_LINEAR)

print ("Tensorflow:")
print (np_c[0, :, :, 0])
print ("OpenCV:")
print (temp)
print ("Tensorflow:")
print (np_d[0, :, :, 0])
print ("OpenCV:")
print (temp2)
print ("Tensorflow:")
print (np_c[0, :, :, 0] - temp)
print ("OpenCV:")
print (np_d[0, :, :, 0] - temp2)

Here is my output with tensorflow 1.14.0:
The upsampling looks correct, but there seems to be some issue with the downsampling.

Tensorflow:
[[5.   5.   5.   4.2  3.4  2.6  1.8  1.   1.   1.  ]
 [5.   5.   5.   4.2  3.4  2.6  1.8  1.   1.   1.  ]
 [5.   5.   5.   4.2  3.4  2.6  1.8  1.   1.   1.  ]
 [4.2  4.2  4.2  3.72 3.24 2.76 2.28 1.8  1.8  1.8 ]
 [3.4  3.4  3.4  3.24 3.08 2.92 2.76 2.6  2.6  2.6 ]
 [2.6  2.6  2.6  2.76 2.92 3.08 3.24 3.4  3.4  3.4 ]
 [1.8  1.8  1.8  2.28 2.76 3.24 3.72 4.2  4.2  4.2 ]
 [1.   1.   1.   1.8  2.6  3.4  4.2  5.   5.   5.  ]
 [1.   1.   1.   1.8  2.6  3.4  4.2  5.   5.   5.  ]
 [1.   1.   1.   1.8  2.6  3.4  4.2  5.   5.   5.  ]]
OpenCV:
[[5.   5.   5.   4.2  3.4  2.6  1.8  1.   1.   1.  ]
 [5.   5.   5.   4.2  3.4  2.6  1.8  1.   1.   1.  ]
 [5.   5.   5.   4.2  3.4  2.6  1.8  1.   1.   1.  ]
 [4.2  4.2  4.2  3.72 3.24 2.76 2.28 1.8  1.8  1.8 ]
 [3.4  3.4  3.4  3.24 3.08 2.92 2.76 2.6  2.6  2.6 ]
 [2.6  2.6  2.6  2.76 2.92 3.08 3.24 3.4  3.4  3.4 ]
 [1.8  1.8  1.8  2.28 2.76 3.24 3.72 4.2  4.2  4.2 ]
 [1.   1.   1.   1.8  2.6  3.4  4.2  5.   5.   5.  ]
 [1.   1.   1.   1.8  2.6  3.4  4.2  5.   5.   5.  ]
 [1.   1.   1.   1.8  2.6  3.4  4.2  5.   5.   5.  ]]
Tensorflow:
[[5.    4.5   3.    1.5   1.   ]
 [4.5   4.125 3.    1.875 1.5  ]
 [3.    3.    3.    3.    3.   ]
 [1.5   1.875 3.    4.125 4.5  ]
 [1.    1.5   3.    4.5   5.   ]]
OpenCV:
[[5.   4.6  3.   1.4  1.  ]
 [4.6  4.28 3.   1.72 1.4 ]
 [3.   3.   3.   3.   3.  ]
 [1.4  1.72 3.   4.28 4.6 ]
 [1.   1.4  3.   4.6  5.  ]]
Tensorflow:
[[ 0.  0.  0.  0. -0.  0. -0.  0.  0.  0.]
 [ 0.  0.  0.  0. -0.  0. -0.  0.  0.  0.]
 [ 0.  0.  0.  0. -0.  0. -0.  0.  0.  0.]
 [ 0.  0.  0.  0. -0. -0. -0.  0.  0.  0.]
 [-0. -0. -0.  0. -0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [-0. -0. -0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]]
OpenCV:
[[ 0.    -0.1    0.     0.1    0.   ]
 [-0.1   -0.155 -0.     0.155  0.1  ]
 [ 0.     0.     0.     0.     0.   ]
 [ 0.1    0.155  0.    -0.155 -0.1  ]
 [ 0.     0.1    0.    -0.1    0.   ]]

If I change bilinear to bicubic, there are even more inconsistent values in the outputs:

Tensorflow:
[[ 6.03   5.833  5.462  4.696  3.598  2.402  1.304  0.538  0.167 -0.03 ]
 [ 5.833  5.649  5.302  4.586  3.559  2.441  1.414  0.698  0.351  0.167]
 [ 5.462  5.302  5.     4.378  3.486  2.514  1.622  1.     0.698  0.538]
 [ 4.696  4.586  4.378  3.949  3.335  2.665  2.051  1.622  1.414  1.304]
 [ 3.598  3.559  3.486  3.335  3.118  2.882  2.665  2.514  2.441  2.402]
 [ 2.402  2.441  2.514  2.665  2.882  3.118  3.335  3.486  3.559  3.598]
 [ 1.304  1.414  1.622  2.051  2.665  3.335  3.949  4.378  4.586  4.696]
 [ 0.538  0.698  1.     1.622  2.514  3.486  4.378  5.     5.302  5.462]
 [ 0.167  0.351  0.698  1.414  2.441  3.559  4.586  5.302  5.649  5.833]
 [-0.03   0.167  0.538  1.304  2.402  3.598  4.696  5.462  5.833  6.03 ]]
OpenCV:
[[5.957 5.899 5.432 4.576 3.545 2.455 1.424 0.568 0.101 0.043]
 [5.899 5.842 5.384 4.545 3.534 2.466 1.455 0.616 0.158 0.101]
 [5.432 5.384 5.    4.296 3.448 2.552 1.704 1.    0.616 0.568]
 [4.576 4.545 4.296 3.84  3.29  2.71  2.16  1.704 1.455 1.424]
 [3.545 3.534 3.448 3.29  3.1   2.9   2.71  2.552 2.466 2.455]
 [2.455 2.466 2.552 2.71  2.9   3.1   3.29  3.448 3.534 3.545]
 [1.424 1.455 1.704 2.16  2.71  3.29  3.84  4.296 4.545 4.576]
 [0.568 0.616 1.    1.704 2.552 3.448 4.296 5.    5.384 5.432]
 [0.101 0.158 0.616 1.455 2.466 3.534 4.545 5.384 5.842 5.899]
 [0.043 0.101 0.568 1.424 2.455 3.545 4.576 5.432 5.899 5.957]]
Tensorflow:
[[5.873 5.046 3.    0.954 0.127]
 [5.046 4.457 3.    1.543 0.954]
 [3.    3.    3.    3.    3.   ]
 [0.954 1.543 3.    4.457 5.046]
 [0.127 0.954 3.    5.046 5.873]]
OpenCV:
[[5.904 5.102 3.    0.898 0.096]
 [5.102 4.521 3.    1.479 0.898]
 [3.    3.    3.    3.    3.   ]
 [0.898 1.479 3.    4.521 5.102]
 [0.096 0.898 3.    5.102 5.904]]
Tensorflow:
[[ 0.072 -0.066  0.03   0.12   0.053 -0.053 -0.12  -0.03   0.066 -0.072]
 [-0.066 -0.192 -0.082  0.041  0.025 -0.025 -0.041  0.082  0.192  0.066]
 [ 0.03  -0.082  0.     0.082  0.038 -0.038 -0.082  0.     0.082 -0.03 ]
 [ 0.12   0.041  0.082  0.109  0.044 -0.044 -0.109 -0.082 -0.041 -0.12 ]
 [ 0.053  0.025  0.038  0.044  0.018 -0.018 -0.044 -0.038 -0.025 -0.053]
 [-0.053 -0.025 -0.038 -0.044 -0.018  0.018  0.044  0.038  0.025  0.053]
 [-0.12  -0.041 -0.082 -0.109 -0.044  0.044  0.109  0.082  0.041  0.12 ]
 [-0.03   0.082  0.    -0.082 -0.038  0.038  0.082  0.    -0.082  0.03 ]
 [ 0.066  0.192  0.082 -0.041 -0.025  0.025  0.041 -0.082 -0.192 -0.066]
 [-0.072  0.066 -0.03  -0.12  -0.053  0.053  0.12   0.03  -0.066  0.072]]
OpenCV:
[[-0.031 -0.056  0.     0.056  0.031]
 [-0.056 -0.064  0.     0.064  0.056]
 [ 0.     0.     0.     0.     0.   ]
 [ 0.056  0.064 -0.    -0.064 -0.056]
 [ 0.031  0.056  0.    -0.056 -0.031]]

Am I missing something?

@johnpjf
johnpjf commented Aug 13, 2019

For bilinear, you are using antialias=True for TensorFlow, which enlarges the kernel when downsampling in order to antialias; this is why your downsampled version is different in TF.
For bicubic you also have antialias=True, but the upsampled version is different as well. This looks like it comes from a different choice of the free parameter in the bicubic kernel (https://en.wikipedia.org/wiki/Bicubic_interpolation#Bicubic_convolution_algorithm):
TF uses a=0.5, while OpenCV uses a=0.75.
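For reference, here is the cubic convolution kernel from that Wikipedia page with the free parameter exposed (a sketch of the formula only, not either library's code; in the usual sign convention Catmull-Rom corresponds to a = -0.5 and OpenCV's choice to a = -0.75):

```python
def cubic_kernel(x, a=-0.5):
    # Keys cubic convolution kernel; `a` is the free parameter that
    # differs between implementations (e.g. -0.5 vs -0.75).
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * (x**3 - 5 * x**2 + 8 * x - 4)
    return 0.0

# Whatever `a` is, the kernel is 1 at the sample itself and 0 at the
# integer neighbours; only the in-between weights change.
print(cubic_kernel(0.0))         # 1.0
print(cubic_kernel(1.0))         # 0.0
print(cubic_kernel(0.5, -0.75))  # different weight than with a=-0.5
```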

@mrgloom
mrgloom commented Sep 6, 2019

It seems bilinear upsampling can also be done with a transposed convolution:
http://warmspringwinds.github.io/tensorflow/tf-slim/2016/11/22/upsampling-and-image-segmentation-with-tensorflow-and-tf-slim/

Also, here is a great explanation of resize behaviour in TensorFlow:
https://jricheimer.github.io/tensorflow/2019/02/11/resize-confusion/

And some more on the related Core ML behaviour:
https://machinethink.net/blog/coreml-upsampling/

@netw0rkf10w

Hello,

Could anybody please tell me if it's totally safe to use tf.image.resize in TF 2.0 now?

Thank you very much in advance!

@johnpjf
johnpjf commented Nov 6, 2019

Yes, tf_v2.image.resize works correctly.

@netw0rkf10w

@johnpjf Perfect. Thanks a lot!

@MingxiLi
MingxiLi commented Dec 24, 2019

Hi,
Could anyone please tell me if it is safe to use tf.image.resize_bilinear in tf 1.12?

@johnpjf
johnpjf commented Dec 24, 2019

@MingxiLi
In tf 1.12 I believe you would need to use the tf 2.0 version, so something like this would match resize_bilinear:

tf.compat.v2.image.resize(
        image, new_shape,
    )

since the defaults for resize are:

method=tf.compat.v2.image.ResizeMethod.BILINEAR, antialias=False,

@brucechou1983

Also, UpSampling2D in tf.keras sets align_corners=False by default. Even worse, it doesn't let you change this from the high-level API.

@johnpjf
johnpjf commented Feb 11, 2020

@brucechou1983
Why do you want to set align_corners=True in UpSampling2D?

@anilsathyan7

@brucechou1983 you can use TensorFlow resize with align_corners=True and wrap it inside a Lambda layer in Keras or tf.keras.

@dcouwenh
dcouwenh commented Jun 9, 2020

I know I am in the minority here, but I have an application where I actually want the resizing to align the corners the way tf.compat.v1.image.resize_bilinear(..., align_corners=True) does. I know I can reach the old code via tf.compat.v1.image.resize_bilinear(), but I am concerned that this compatibility API will eventually go away, and I cannot figure out how to duplicate the old behaviour using tf.image.resize() in version 2.0 or higher.

Can someone help me figure out how to use the new tf.image.resize() function to duplicate the processing of the old tf.compat.v1.image.resize_bilinear() function with align_corners=True?

Here is my code:

!conda list tensor*
import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
import numpy as np
import cv2

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

resize_shape = (9,9)

a = np.ones((1, 2, 2, 1), dtype=np.float32)
a[0, 0, 0, 0] = 5.0
a[0, 1, 1, 0] = 5.0

b = tf.constant(a, dtype=tf.float32)

c1 = tf_v1.image.resize_bilinear(b, resize_shape, align_corners=True)
c2 = tf.image.resize(b, resize_shape, method='bilinear', antialias=False)

c1 = c1.numpy()
c2 = c2.numpy()

print ("\nInput")
print (a[0, :, :, 0])
print ("Tensorflow 1:")
print (c1[0, :, :, 0])
print ("Tensorflow 2:")
print (c2[0, :, :, 0])

And here is the output. I want to get the output from "Tensorflow 1" using tf.image.resize( ) in tf 2.0:

# packages in environment at /home/dcouwenh/.conda/envs/hornet2:
#
# Name                    Version                   Build  Channel
tensorboard               2.0.0              pyhb230dea_0  
tensorflow                2.0.0           gpu_py36h6b29c10_0  
tensorflow-base           2.0.0           gpu_py36h0ec5d1f_0  
tensorflow-estimator      2.0.0              pyh2649769_0  
tensorflow-gpu            2.0.0                h0d30ee6_0  

Input
[[5. 1.]
 [1. 5.]]
Tensorflow 1:
[[5.    4.5   4.    3.5   3.    2.5   2.    1.5   1.   ]
 [4.5   4.125 3.75  3.375 3.    2.625 2.25  1.875 1.5  ]
 [4.    3.75  3.5   3.25  3.    2.75  2.5   2.25  2.   ]
 [3.5   3.375 3.25  3.125 3.    2.875 2.75  2.625 2.5  ]
 [3.    3.    3.    3.    3.    3.    3.    3.    3.   ]
 [2.5   2.625 2.75  2.875 3.    3.125 3.25  3.375 3.5  ]
 [2.    2.25  2.5   2.75  3.    3.25  3.5   3.75  4.   ]
 [1.5   1.875 2.25  2.625 3.    3.375 3.75  4.125 4.5  ]
 [1.    1.5   2.    2.5   3.    3.5   4.    4.5   5.   ]]
Tensorflow 2:
[[5.    5.    4.778 3.889 3.    2.111 1.222 1.    1.   ]
 [5.    5.    4.778 3.889 3.    2.111 1.222 1.    1.   ]
 [4.778 4.778 4.58  3.79  3.    2.21  1.42  1.222 1.222]
 [3.889 3.889 3.79  3.395 3.    2.605 2.21  2.111 2.111]
 [3.    3.    3.    3.    3.    3.    3.    3.    3.   ]
 [2.111 2.111 2.21  2.605 3.    3.395 3.79  3.889 3.889]
 [1.222 1.222 1.42  2.21  3.    3.79  4.58  4.778 4.778]
 [1.    1.    1.222 2.111 3.    3.889 4.778 5.    5.   ]
 [1.    1.    1.222 2.111 3.    3.889 4.778 5.    5.   ]]

@johnpjf
johnpjf commented Jun 11, 2020

@dcouwenh There are valid use cases that align the center of the corner pixels. I have a change that exposes a "ScaleAndTranslateOp" that underlies the new resize, so you can use that when we merge it in (soon).

Until then you can use the v1 version, or you can access the low level scale and translate op as a workaround:
https://codeclimate.com/github/tensorflow/tensorflow/tensorflow/python/ops/image_ops_impl.py/source
at line 1510.

@martinwicke
Member
martinwicke commented Jun 11, 2020 via email

@dcouwenh

@johnpjf Thanks for your reply. I edited my code to call the scale_and_translate op directly as you suggested, but I am not able to reproduce the results from the v1 resize_bilinear op. I am not trying to align the centers of the corner pixels, I am trying to align the corners of the corner pixels (like the v1 code did with align_corners=True).

I tried to play with the scale and translate parameters, but no matter what I tried I get blocks of pixels in the corners that have a constant value. In my application, I have a vector field that I want to smoothly vary over the entire area of the output image (like the Tensorflow 1 output shows), and not have the constant value areas in the corners shown in the Tensorflow 2 and scale_and_translate outputs.

Here is my edited code:

!conda list tensor*
import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
import numpy as np
from tensorflow.python.ops import gen_image_ops

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

resize_shape = (9,9)

a = np.ones((1, 2, 2, 1), dtype=np.float32)
a[0, 0, 0, 0] = 5.0
a[0, 1, 1, 0] = 5.0

b = tf.constant(a, dtype=tf.float32)

c1 = tf_v1.image.resize_bilinear(b, resize_shape, align_corners=True)
c2 = tf.image.resize(b, resize_shape, method='bilinear', antialias=False)

scale = [9/2,9/2]
translate = [0,0]
c3 = gen_image_ops.scale_and_translate(
          b,
          resize_shape,
          scale,
          translate,
          kernel_type='triangle',
          antialias=False)

c1 = c1.numpy()
c2 = c2.numpy()
c3 = c3.numpy()

print ("\nInput")
print (a[0, :, :, 0])
print ("Tensorflow 1:")
print (c1[0, :, :, 0])
print ("Tensorflow 2:")
print (c2[0, :, :, 0])
print ("scale_and_translate:")
print (c3[0, :, :, 0])

And here is the updated output. The direct call to scale_and_translate still has the duplicated values in the corners; I am trying to get the Tensorflow 1 output, which varies linearly away from the corners.

# packages in environment at /home/dcouwenh/.conda/envs/hornet2:
#
# Name                    Version                   Build  Channel
tensorboard               2.2.1              pyh532a8cf_0  
tensorboard-plugin-wit    1.6.0                      py_0  
tensorflow                2.1.0           gpu_py37h7a4bb67_0  
tensorflow-base           2.1.0           gpu_py37h6c5654b_0  
tensorflow-estimator      2.1.0              pyhd54b08b_0  
tensorflow-gpu            2.1.0                h0d30ee6_0  

Input
[[5. 1.]
 [1. 5.]]
Tensorflow 1:
[[5.    4.5   4.    3.5   3.    2.5   2.    1.5   1.   ]
 [4.5   4.125 3.75  3.375 3.    2.625 2.25  1.875 1.5  ]
 [4.    3.75  3.5   3.25  3.    2.75  2.5   2.25  2.   ]
 [3.5   3.375 3.25  3.125 3.    2.875 2.75  2.625 2.5  ]
 [3.    3.    3.    3.    3.    3.    3.    3.    3.   ]
 [2.5   2.625 2.75  2.875 3.    3.125 3.25  3.375 3.5  ]
 [2.    2.25  2.5   2.75  3.    3.25  3.5   3.75  4.   ]
 [1.5   1.875 2.25  2.625 3.    3.375 3.75  4.125 4.5  ]
 [1.    1.5   2.    2.5   3.    3.5   4.    4.5   5.   ]]
Tensorflow 2:
[[5.    5.    4.778 3.889 3.    2.111 1.222 1.    1.   ]
 [5.    5.    4.778 3.889 3.    2.111 1.222 1.    1.   ]
 [4.778 4.778 4.58  3.79  3.    2.21  1.42  1.222 1.222]
 [3.889 3.889 3.79  3.395 3.    2.605 2.21  2.111 2.111]
 [3.    3.    3.    3.    3.    3.    3.    3.    3.   ]
 [2.111 2.111 2.21  2.605 3.    3.395 3.79  3.889 3.889]
 [1.222 1.222 1.42  2.21  3.    3.79  4.58  4.778 4.778]
 [1.    1.    1.222 2.111 3.    3.889 4.778 5.    5.   ]
 [1.    1.    1.222 2.111 3.    3.889 4.778 5.    5.   ]]
scale_and_translate:
[[5.    5.    4.778 3.889 3.    2.111 1.222 1.    1.   ]
 [5.    5.    4.778 3.889 3.    2.111 1.222 1.    1.   ]
 [4.778 4.778 4.58  3.79  3.    2.21  1.42  1.222 1.222]
 [3.889 3.889 3.79  3.395 3.    2.605 2.21  2.111 2.111]
 [3.    3.    3.    3.    3.    3.    3.    3.    3.   ]
 [2.111 2.111 2.21  2.605 3.    3.395 3.79  3.889 3.889]
 [1.222 1.222 1.42  2.21  3.    3.79  4.58  4.778 4.778]
 [1.    1.    1.222 2.111 3.    3.889 4.778 5.    5.   ]
 [1.    1.    1.222 2.111 3.    3.889 4.778 5.    5.   ]]

@johnpjf
johnpjf commented Jun 13, 2020

You can figure out what the params should be by writing the equations out. For your example you want the pixel at position 0.5 to stay at 0.5, and the one at 1.5 to map to 8.5,
so:

0.5 = s * 0.5  + t
8.5 = s * 1.5 + t

Solving gives:

s = 8, t = -3.5

so:

scale = [8, 8]
translate = [-3.5, -3.5]
c3 = tf.raw_ops.ScaleAndTranslate(
          images=b,
          size=resize_shape,
          scale=scale,
          translation=translate,
          kernel_type='triangle',
          antialias=False)

Matches the tf1 behavior.

@dcouwenh

@johnpjf Excellent -- thank you! This is exactly what I was looking for.

For completeness, I have edited my code to include your solution for arbitrary input and output shapes.

!conda list tensor*
import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
import numpy as np
from tensorflow.python.ops import gen_image_ops

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

Hi,Wi = (3,3)
Ho,Wo = (10,9)

a = np.arange(Hi*Wi).reshape(1, Hi, Wi, 1)

b = tf.constant(a, dtype=tf.float32)

c1 = tf_v1.image.resize_bilinear(b, (Ho,Wo), align_corners=True)
c2 = tf.image.resize(b, (Ho,Wo), method='bilinear', antialias=False)

sy = (Ho-1)/(Hi-1)
sx = (Wo-1)/(Wi-1)
scale = [sy,sx]
tx = (1-sx)/2
ty = (1-sy)/2
translate = [ty,tx]
c3 = gen_image_ops.scale_and_translate(
          b,
          (Ho,Wo),
          (sy,sx),
          (ty,tx),
          kernel_type='triangle',
          antialias=False)

c1 = c1.numpy()
c2 = c2.numpy()
c3 = c3.numpy()

print ("\nInput")
print (a[0, :, :, 0])
print ("Tensorflow 1:")
print (c1[0, :, :, 0])
print ("Tensorflow 2:")
print (c2[0, :, :, 0])
print ("scale_and_translate:")
print (c3[0, :, :, 0])

And here is the output:

# packages in environment at /home/dcouwenh/.conda/envs/hornet2:
#
# Name                    Version                   Build  Channel
tensorboard               2.2.1              pyh532a8cf_0  
tensorboard-plugin-wit    1.6.0                      py_0  
tensorflow                2.1.0           gpu_py37h7a4bb67_0  
tensorflow-base           2.1.0           gpu_py37h6c5654b_0  
tensorflow-estimator      2.1.0              pyhd54b08b_0  
tensorflow-gpu            2.1.0                h0d30ee6_0  

Input
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Tensorflow 1:
[[0.    0.25  0.5   0.75  1.    1.25  1.5   1.75  2.   ]
 [0.667 0.917 1.167 1.417 1.667 1.917 2.167 2.417 2.667]
 [1.333 1.583 1.833 2.083 2.333 2.583 2.833 3.083 3.333]
 [2.    2.25  2.5   2.75  3.    3.25  3.5   3.75  4.   ]
 [2.667 2.917 3.167 3.417 3.667 3.917 4.167 4.417 4.667]
 [3.333 3.583 3.833 4.083 4.333 4.583 4.833 5.083 5.333]
 [4.    4.25  4.5   4.75  5.    5.25  5.5   5.75  6.   ]
 [4.667 4.917 5.167 5.417 5.667 5.917 6.167 6.417 6.667]
 [5.333 5.583 5.833 6.083 6.333 6.583 6.833 7.083 7.333]
 [6.    6.25  6.5   6.75  7.    7.25  7.5   7.75  8.   ]]
Tensorflow 2:
[[0.    0.    0.333 0.667 1.    1.333 1.667 2.    2.   ]
 [0.    0.    0.333 0.667 1.    1.333 1.667 2.    2.   ]
 [0.75  0.75  1.083 1.417 1.75  2.083 2.417 2.75  2.75 ]
 [1.65  1.65  1.983 2.317 2.65  2.983 3.317 3.65  3.65 ]
 [2.55  2.55  2.883 3.217 3.55  3.883 4.217 4.55  4.55 ]
 [3.45  3.45  3.783 4.117 4.45  4.783 5.117 5.45  5.45 ]
 [4.35  4.35  4.683 5.017 5.35  5.683 6.017 6.35  6.35 ]
 [5.25  5.25  5.583 5.917 6.25  6.583 6.917 7.25  7.25 ]
 [6.    6.    6.333 6.667 7.    7.333 7.667 8.    8.   ]
 [6.    6.    6.333 6.667 7.    7.333 7.667 8.    8.   ]]
scale_and_translate:
[[0.    0.25  0.5   0.75  1.    1.25  1.5   1.75  2.   ]
 [0.667 0.917 1.167 1.417 1.667 1.917 2.167 2.417 2.667]
 [1.333 1.583 1.833 2.083 2.333 2.583 2.833 3.083 3.333]
 [2.    2.25  2.5   2.75  3.    3.25  3.5   3.75  4.   ]
 [2.667 2.917 3.167 3.417 3.667 3.917 4.167 4.417 4.667]
 [3.333 3.583 3.833 4.083 4.333 4.583 4.833 5.083 5.333]
 [4.    4.25  4.5   4.75  5.    5.25  5.5   5.75  6.   ]
 [4.667 4.917 5.167 5.417 5.667 5.917 6.167 6.417 6.667]
 [5.333 5.583 5.833 6.083 6.333 6.583 6.833 7.083 7.333]
 [6.    6.25  6.5   6.75  7.    7.25  7.5   7.75  8.   ]]
