It is possible to nest a tf.map_fn within another tf.map_fn call. However, if the input tensor is a RaggedTensor and no function signature is provided, the code assumes the output is a fully specified tensor and fills the output buffer with uninitialized contents from the heap:
import tensorflow as tf
x = tf.ragged.constant([[1,2,3], [4,5], [6]])
t = tf.map_fn(lambda r: tf.map_fn(lambda y: r, r), x)
z = tf.ragged.constant([[[1,2,3],[1,2,3],[1,2,3]],[[4,5],[4,5]],[[6]]])
The t and z outputs should be identical; however, this is not the case. The last row of t contains data from the heap, which can be used to leak other memory information.
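As a point of comparison, the result the nested call is expected to produce can be sketched in plain Python, without TensorFlow. The helper name expected_nested_map is hypothetical and only mirrors the semantics of the example above: the inner map returns the whole row once per element, and the outer map collects those results per row.

```python
def expected_nested_map(rows):
    """Return, for each row, one copy of that row per element in it."""
    # Inner comprehension: one copy of the row for every element of the row.
    # Outer comprehension: repeat this for every row of the ragged input.
    return [[list(row) for _ in row] for row in rows]


x = [[1, 2, 3], [4, 5], [6]]
z = expected_nested_map(x)
print(z)
# [[[1, 2, 3], [1, 2, 3], [1, 2, 3]], [[4, 5], [4, 5]], [[6]]]
```

This matches the z constant in the example, which is the value t should have had.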
The bug lies in the conversion from a Variant tensor to a RaggedTensor. The implementation does not check that all inner shapes match, and this results in the additional dimensions in the above example.
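The kind of validation that was missing can be illustrated with a small plain-Python sketch. This is not the TensorFlow fix itself (which lives in the C++ kernel); the function name check_inner_shapes_match and the shape tuples are assumptions made for illustration. The idea is that components may differ in their leading dimension but must agree on every trailing (inner) dimension before they are combined.

```python
from typing import Sequence, Tuple


def check_inner_shapes_match(shapes: Sequence[Tuple[int, ...]]) -> None:
    """Raise ValueError unless all shapes share the same trailing dimensions."""
    if not shapes:
        return
    # Everything after the first dimension must be identical across components.
    expected_inner = tuple(shapes[0][1:])
    for index, shape in enumerate(shapes):
        if tuple(shape[1:]) != expected_inner:
            raise ValueError(
                f"Component {index} has inner shape {tuple(shape[1:])}, "
                f"expected {expected_inner}"
            )


# Compatible components: only the leading dimension differs.
check_inner_shapes_match([(3, 2), (5, 2)])

# Assumed per-row result shapes for the first example (3x3, 2x2, 1x1):
# the inner dimensions disagree, so the check rejects them.
try:
    check_inner_shapes_match([(3, 3), (2, 2), (1, 1)])
except ValueError as err:
    print("rejected:", err)
```

Rejecting mismatched components up front avoids sizing an output buffer from the first component and then leaving part of it unwritten.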
The same implementation can result in data loss if the input tensor is tweaked:
import tensorflow as tf
x = tf.ragged.constant([[1,2], [3,4,5], [6]])
t = tf.map_fn(lambda r: tf.map_fn(lambda y: r, r), x)
Here, the output tensor will only have 2 elements for each inner dimension.
We have patched the issue in GitHub commit 4e2565483d0ffcadc719bd44893fb7f609bb5f12.
The fix will be included in TensorFlow 2.6.0. We will also cherry-pick this commit onto TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in the supported range.
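The version list above can be turned into a quick check. The following is a minimal sketch, not an official tool: the helper name includes_fix is hypothetical, pre-release suffixes such as "-rc1" are ignored, and release lines older than 2.3 are treated as unpatched because the advisory names no fixed release for them.

```python
import re

# Minimum patch level carrying the fix on each cherry-picked release line.
PATCHED_BRANCHES = {(2, 3): 4, (2, 4): 3, (2, 5): 1}
# First release line that ships the fix from its initial release.
FIRST_FIXED_LINE = (2, 6)


def includes_fix(version: str) -> bool:
    """Return True if the version string names a release listed as fixed."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    min_patch = PATCHED_BRANCHES.get((major, minor))
    # Release lines not listed in the advisory are assumed to be unpatched.
    return min_patch is not None and patch >= min_patch


for candidate in ("2.3.4", "2.4.2", "2.5.1", "2.6.0"):
    print(candidate, includes_fix(candidate))
```

In an installed environment, the string to check would typically come from tf.__version__.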
Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.
This vulnerability has been reported by Haris Sahovic.