python - How to write a step_function as an activation function in Keras?

Update: Thanks to the Q&A here, I was able to build a working step function with tensorflow (see the code below).

Now my question becomes:

How can this tf_stepy activation function, created in tensorflow, be used in keras?

I tried to use tf_stepy in keras with the following code, but it does not work:

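from keras.layers import Activation
from keras.utils.generic_utils import get_custom_objects
from tensorflow_step_function import tf_stepy

def buy_hold_sell(x):
    return tf_stepy(x)

# Register the activation under the name 'custom_activation'
get_custom_objects().update({'custom_activation': Activation(buy_hold_sell)})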

Below is the step activation function created with tensorflow:

# tensorflow_step_function.py
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import _to_tensor
import numpy as np

def stepy(x):
    if x < 0.33:
        return 0.0
    elif x > 0.66:
        return 1.0
    else:
        return 0.5

# Element-wise version of stepy for numpy arrays
np_stepy = np.vectorize(stepy)

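def d_stepy(x):
    # Surrogate "derivative": this reuses the step values themselves.
    # The true derivative of a step function is 0 everywhere except at the jumps.
    if x < 0.33:
        return 0.0
    elif x > 0.66:
        return 1.0
    else:
        return 0.5
np_d_stepy = np.vectorize(d_stepy)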

from tensorflow.python.framework import ops

np_d_stepy_32 = lambda x: np_d_stepy(x).astype(np.float32)

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):

    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))

    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

def tf_d_stepy(x,name=None):
    with ops.op_scope([x], name, "d_stepy") as name:
        y = tf.py_func(np_d_stepy_32,
                        [x],
                        [tf.float32],
                        name=name,
                        stateful=False)
        return y[0]

def stepygrad(op, grad):
    x = op.inputs[0]

    n_gr = tf_d_stepy(x)
    return grad * n_gr

np_stepy_32 = lambda x: np_stepy(x).astype(np.float32)

def tf_stepy(x, name=None):

    with ops.op_scope([x], name, "stepy") as name:
        y = py_func(np_stepy_32,
                        [x],
                        [tf.float32],
                        name=name,
                        grad=stepygrad)  # <-- here's the call to the gradient
        return y[0]

with tf.Session() as sess:

    x = tf.constant([0.2,0.7,0.4,0.6])
    y = tf_stepy(x)
    tf.initialize_all_variables().run()

    print(x.eval(), y.eval(), tf.gradients(y, [x])[0].eval())

Original question

I want to write an activation function in Keras based on the idea of a step function.

In numpy, such a step activation function should look like this:

def step_func(x, lower_threshold=0.33, higher_threshold=0.66):

    # x is an array; it is modified in place and also returned

    for index in range(len(x)):
        if x[index] < lower_threshold:
            x[index] = 0.0
        elif x[index] > higher_threshold:
            x[index] = 1.0
        else:
            x[index] = 0.5
    return x
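
As a side note, the loop above can also be sketched without explicit indexing. The version below is only an illustration: the name step_func_vectorized is made up here, and unlike the loop it returns a new array instead of modifying x in place.

```python
import numpy as np

def step_func_vectorized(x, lower_threshold=0.33, higher_threshold=0.66):
    """Return a new array holding 0.0, 0.5 or 1.0 per element (x is not modified)."""
    x = np.asarray(x, dtype=float)
    return np.select(
        [x < lower_threshold, x > higher_threshold],  # conditions, checked in order
        [0.0, 1.0],                                   # value chosen for each condition
        default=0.5,                                  # everything in between
    )

result = step_func_vectorized([0.2, 0.7, 0.4, 0.6])
print(result)
```

Values exactly equal to a threshold fall into the middle level, which matches the strict comparisons in the loop.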

I managed to convert the step function from the numpy version to a keras.tensor version. It works as follows:

import numpy as np
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import _to_tensor
from keras.layers import Activation
from keras.utils.generic_utils import get_custom_objects
def high_med_low(x, lower_threshold=0.33, higher_threshold=0.66):
    """
    x: tensor
    return a tensor
    """
    # x_shape = K.get_variable_shape(x)
    # x_flat = K.flatten(x)
    x_array = K.get_value(x)
    for index in range(x_array.shape[0]):
        if x_array[index,0] < lower_threshold:
            x_array[index,0] = 0.0
        elif x_array[index,0] > higher_threshold:
            x_array[index,0] = 1.0
        else:
            x_array[index,0] = 0.5

    # x_return = x_array.reshape(x_shape)
    return _to_tensor(x_array, x.dtype.base_dtype)

x = K.ones((10,1)) * 0.7
print(high_med_low(x))

# the following line of code is used in building a model with keras
get_custom_objects().update({'custom_activation': Activation(high_med_low)})

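Although this function works on its own, it causes an error when applied to a model. My suspicion is that, as an activation layer, it should not access the value of each element of the tensor.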

If so, what is the correct way to write this step activation function?

Thanks!
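
On the suspicion raised in the question: the three levels can be produced from whole-array comparisons alone, with no loop and no reading of individual values. The sketch below uses NumPy so that it runs on its own; the function name is made up for illustration, and the assumption is that the same expression could be written with backend operations such as K.greater_equal, K.greater and K.cast. This addresses only the element-access problem; the gradient of the result is still zero almost everywhere.

```python
import numpy as np

def high_med_low_elementwise(x, lower_threshold=0.33, higher_threshold=0.66):
    """Three-level step built only from element-wise comparisons and arithmetic."""
    x = np.asarray(x, dtype=float)
    # 1.0 where x has reached the lower threshold, else 0.0
    above_low = (x >= lower_threshold).astype(float)
    # 1.0 where x is strictly above the higher threshold, else 0.0
    above_high = (x > higher_threshold).astype(float)
    # 0 + 0 -> 0.0, 1 + 0 -> 0.5, 1 + 1 -> 1.0
    return 0.5 * (above_low + above_high)

levels = high_med_low_elementwise([[0.2], [0.7], [0.4], [0.6]])
print(levels)
```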

Best answer

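This will not work. The nonlinearity still has to be differentiable. A step function is not differentiable at its jumps and is flat everywhere else, so no useful gradient can be computed.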
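
To make the point concrete, here is a small numerical check, written as a sketch with illustrative helper names (step, central_difference). It estimates the slope of the three-level step with central differences: away from the thresholds the slope is zero, and at the thresholds the estimate depends on the step size h rather than converging to a finite derivative.

```python
import numpy as np

def step(x, lo=0.33, hi=0.66):
    """Three-level step: 0.0 below lo, 1.0 above hi, 0.5 in between."""
    x = np.asarray(x, dtype=float)
    return np.where(x < lo, 0.0, np.where(x > hi, 1.0, 0.5))

def central_difference(f, x, h=1e-4):
    """Numerical slope estimate (f(x + h) - f(x - h)) / (2 h)."""
    x = np.asarray(x, dtype=float)
    return (f(x + h) - f(x - h)) / (2.0 * h)

flat_points = np.array([0.1, 0.5, 0.9])   # well inside each plateau
jump_points = np.array([0.33, 0.66])      # exactly at the thresholds

flat_slopes = central_difference(step, flat_points)
jump_slopes = central_difference(step, jump_points)

print("slopes on the plateaus:", flat_slopes)
print("slopes at the jumps:   ", jump_slopes)
```

With zero slope on the plateaus, gradient descent receives no signal about which direction to move the weights.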

You can always try to build a differentiable function that approximates the step. That is already what sigmoid or tanh do for the "one-step" version.
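
One possible way to do this for the three-level step from the question is to add two sigmoids, one centred at each threshold. This is only a sketch in NumPy: soft_step, soft_step_grad and the sharpness parameter are names chosen here for illustration, and the value 50.0 is an arbitrary assumption, not a recommendation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_step(x, lo=0.33, hi=0.66, sharpness=50.0):
    """Smooth stand-in for the 0 / 0.5 / 1 step: half a sigmoid at each threshold."""
    x = np.asarray(x, dtype=float)
    return 0.5 * sigmoid(sharpness * (x - lo)) + 0.5 * sigmoid(sharpness * (x - hi))

def soft_step_grad(x, lo=0.33, hi=0.66, sharpness=50.0):
    """Analytic derivative of soft_step, using sigmoid'(z) = s * (1 - s)."""
    x = np.asarray(x, dtype=float)
    s_lo = sigmoid(sharpness * (x - lo))
    s_hi = sigmoid(sharpness * (x - hi))
    return 0.5 * sharpness * (s_lo * (1.0 - s_lo) + s_hi * (1.0 - s_hi))

xs = np.array([0.0, 0.5, 1.0])
values = soft_step(xs)        # close to 0, 0.5 and 1
grads = soft_step_grad(xs)    # small away from the thresholds, but defined everywhere

print(values)
print(grads)
```

A larger sharpness brings the curve closer to the hard step but also shrinks the gradient on the plateaus, so the choice is a trade-off.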

I hope this helps :)

Regarding "python - How to write a step_function as an activation function in Keras?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/45142706/
