Understanding Keras layers
Keras provides many high-level layers that make it convenient to build a wide variety of models. This article describes two things:
- the concept and design of a Keras layer
- how a Keras layer maps to the TensorFlow backend
Layer
Creating a layer usually takes two steps:
- initialize an instance by running the __init__() method.
- call the instance by running the __call__() method; this is the main step in making a Keras layer.
The two steps can be written as separate statements, or chained in a single expression.
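To make the two steps concrete, here is a minimal sketch in plain Python. The Dense class below is a toy stand-in written for this article, not the real keras.layers.Dense, and its "tensor" is just a string; it only mirrors the __init__() / __call__() pattern.

```python
class Dense:
    """Toy stand-in for a Keras layer (hypothetical, not keras.layers.Dense)."""

    def __init__(self, units):
        # Step 1: __init__() only stores the layer's configuration.
        self.units = units

    def __call__(self, inputs):
        # Step 2: __call__() connects the layer to its input and returns
        # the output. Here the "tensor" is a plain description string.
        return f"Dense({self.units})({inputs})"


# Form 1: create the instance first, then call it.
layer = Dense(32)
out_a = layer("x")

# Form 2: create and call in one chained expression.
out_b = Dense(32)("x")
```

Both forms produce the same result; the first is handy when the same layer instance needs to be reused later.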
Three parts are involved in __call__():
Input tensor
This is the parameter passed to __call__(). It can be tensors from previous layers or initial input placeholders.
If it is an initial input placeholder, the __init__() method of the input layer creates a tf placeholder and wraps it as a Keras input tensor.
Output tensor
This is the return value of the __call__() method. The output tensor(s) are calculated from the input tensor, and the calculation logic is defined in call(). Every layer should implement a call() method to calculate its output tensor; if you write a custom layer, this is the most important method to implement.
Here we use the Dense layer as an example to see how it works: it simply calculates the output as the dot product of the input tensor and the weights (here self.kernel), which is exactly what a dense layer is expected to do.
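The dot product can be sketched without any framework. The ToyDense class below is a hypothetical stand-in that uses nested Python lists instead of tensors; as far as I know the real Dense layer also adds a bias and applies an activation, which this sketch leaves out.

```python
class ToyDense:
    """Hypothetical stand-in showing only the dot-product part of call()."""

    def __init__(self, kernel):
        # kernel is a list of rows with shape (input_dim, units).
        self.kernel = kernel

    def call(self, inputs):
        # output[j] = sum over i of inputs[i] * kernel[i][j]
        units = len(self.kernel[0])
        return [
            sum(x * row[j] for x, row in zip(inputs, self.kernel))
            for j in range(units)
        ]


dense = ToyDense(kernel=[[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
output = dense.call([1.0, 2.0, 3.0])  # 3 input features -> 2 units
```

With this kernel, the first unit sums features 1 and 3, and the second unit sums twice feature 2 plus feature 3.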
Layer instance
The layer itself is not a tensor. It holds the weights needed by the tensor operations, and also the logic that performs those operations.
Every layer has a build() method. If you write a custom layer that has weights, you need to implement build().
Taking the Dense layer as an example, build() creates a weight for the layer. As described above, this weight is then used in the dot product with the input tensor.
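The relationship between __call__(), build() and call() can be sketched as follows. This is a toy model of the idea, not the Keras source: ToyLayer and ToyDense are hypothetical names, weights are nested lists, and build() takes a plain input dimension where Keras passes an input shape.

```python
import random


class ToyLayer:
    """Hypothetical base class: __call__() builds lazily, then runs call()."""

    def __init__(self):
        self.built = False

    def build(self, input_dim):
        self.built = True

    def call(self, inputs):
        raise NotImplementedError

    def __call__(self, inputs):
        # Weights are created on first use, once the input size is known.
        if not self.built:
            self.build(len(inputs))
        return self.call(inputs)


class ToyDense(ToyLayer):
    def __init__(self, units):
        super().__init__()
        self.units = units
        self.kernel = None  # no weights until build() runs

    def build(self, input_dim):
        # Create a kernel of shape (input_dim, units) with small random values.
        rng = random.Random(0)
        self.kernel = [
            [rng.uniform(-0.1, 0.1) for _ in range(self.units)]
            for _ in range(input_dim)
        ]
        super().build(input_dim)

    def call(self, inputs):
        # Dot product of the inputs with the kernel created in build().
        return [
            sum(x * row[j] for x, row in zip(inputs, self.kernel))
            for j in range(self.units)
        ]


layer = ToyDense(units=2)
kernel_before = layer.kernel           # still None: nothing built yet
output = layer([1.0, 2.0, 3.0])        # first call triggers build(3)
kernel_shape = (len(layer.kernel), len(layer.kernel[0]))
```

Deferring weight creation to build() is what lets a layer be declared with only its output size; the input size is filled in when the layer is first called.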
Container
OK, now we have a layer with an input tensor and an output tensor. We can chain many layers together and then call the Model functional API to build a model from them.
In general, Keras uses a directed acyclic graph (DAG) to represent a model. In Keras this DAG has a name, Container, and the Model class is derived from Container.
A Container provides several things:
- facilities to save/load the weights and the DAG architecture, convert to JSON, etc.
- a mapping to the tf loss and training operations.
- flexible input size
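The DAG idea behind Container can be sketched in a few lines of plain Python. Everything here is a hypothetical toy (Tensor, Layer and this Container are not the Keras classes): each output tensor remembers the layer that produced it and the tensors it came from, and the container walks back from the outputs to recover the layers in dependency order.

```python
import json


class Tensor:
    """Toy symbolic tensor that remembers its producing layer and parents."""

    def __init__(self, name, layer=None, parents=()):
        self.name = name
        self.layer = layer
        self.parents = tuple(parents)


class Layer:
    def __init__(self, name):
        self.name = name

    def __call__(self, *inputs):
        # Calling a layer records an edge from its inputs to its output.
        return Tensor(self.name + "_out", layer=self, parents=inputs)


class Container:
    """Toy DAG container: collects layers by walking back from the outputs."""

    def __init__(self, inputs, outputs):
        self.inputs = inputs
        self.outputs = outputs
        self.layers = []
        seen = set()

        def visit(tensor):
            if id(tensor) in seen:
                return
            seen.add(id(tensor))
            for parent in tensor.parents:  # dependencies first
                visit(parent)
            if tensor.layer is not None and tensor.layer not in self.layers:
                self.layers.append(tensor.layer)

        for out in outputs:
            visit(out)

    def to_json(self):
        # Serialize only the architecture (layer order), not any weights.
        return json.dumps({"layers": [layer.name for layer in self.layers]})


x = Tensor("input")
h = Layer("dense_1")(x)
a = Layer("branch_a")(h)
b = Layer("branch_b")(h)
y = Layer("merge")(a, b)

model = Container(inputs=[x], outputs=[y])
layer_names = [layer.name for layer in model.layers]
```

Because the graph is recovered from the tensors themselves, branching and merging need no special handling: every layer appears once, after the layers it depends on.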
Conclusion
The Keras source code is not hard to read: the design is clear and the concepts behind it are simple, yet together they provide a powerful high-level interface for building many kinds of neural networks.