► 开发者指南 / Sequential 模型

Sequential 模型

作者： fchollet
创建日期 2020/04/12
最后修改日期 2023/06/25
描述： Sequential 模型的完整指南。

设置

import keras
from keras import layers
from keras import ops

何时使用 Sequential 模型

一个 Sequential 模型适用于 一个简单的层堆栈，其中每一层都具有 恰好一个输入张量和一个输出张量。

从示意图上看，以下 Sequential 模型

# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Call model on a test input
x = ops.ones((3, 3))
y = model(x)

等同于此函数

# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = ops.ones((3, 3))
y = layer3(layer2(layer1(x)))

Sequential 模型 不适合 用于以下情况：

你的模型有多个输入或多个输出
你的任何层有多个输入或多个输出
你需要进行层共享
你想要非线性拓扑结构（例如，残差连接、多分支模型）

创建 Sequential 模型

你可以通过向 Sequential 构造函数传递一个层列表来创建一个 Sequential 模型

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

它的层可以通过 layers 属性访问

model.layers

[<Dense name=dense, built=False>,
 <Dense name=dense_1, built=False>,
 <Dense name=dense_2, built=False>]

你也可以通过 add() 方法逐步创建一个 Sequential 模型

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

注意，还有一个对应的 pop() 方法来移除层：Sequential 模型非常类似于一个层列表。

model.pop()
print(len(model.layers))  # 2

另外请注意，Sequential 构造函数接受一个 name 参数，就像 Keras 中的任何层或模型一样。这对于使用具有语义意义的名称来标注 TensorBoard 图非常有用。

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))

提前指定输入形状

通常，Keras 中的所有层都需要知道其输入的形状，才能创建它们的权重。因此，当你像这样创建一个层时，它最初没有权重

layer = layers.Dense(3)
layer.weights  # Empty

[]

当它第一次在输入上被调用时，就会创建其权重，因为权重的形状取决于输入的形状

# Call layer on a test input
x = ops.ones((1, 4))
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)

[<KerasVariable shape=(4, 3), dtype=float32, path=dense_6/kernel>,
 <KerasVariable shape=(3,), dtype=float32, path=dense_6/bias>]

当然，这也适用于 Sequential 模型。当你实例化一个没有指定输入形状的 Sequential 模型时，它并没有被“构建”：它没有权重（并且调用 model.weights 会导致一个说明这一点的错误）。权重是在模型第一次看到一些输入数据时创建的

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)  # No weights at this stage!

# At this point, you can't do this:
# model.weights

# You also can't do this:
# model.summary()

# Call the model on a test input
x = ops.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights))  # 6

Number of weights after calling the model: 6

模型一旦被“构建”，你可以调用它的 summary() 方法来显示其内容

model.summary()

Model: "sequential_3"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ dense_7 (Dense)                 │ (1, 2)                    │         10 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense_8 (Dense)                 │ (1, 3)                    │          9 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense_9 (Dense)                 │ (1, 4)                    │         16 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 35 (140.00 B)

 Trainable params: 35 (140.00 B)

 Non-trainable params: 0 (0.00 B)

然而，在逐步构建 Sequential 模型时，能够显示模型的当前摘要（包括当前的输出形状）会非常有用。在这种情况下，你应该通过向模型传递一个 Input 对象来启动你的模型，这样它从一开始就知道其输入形状

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()

Model: "sequential_4"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ dense_10 (Dense)                │ (None, 2)                 │         10 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 10 (40.00 B)

 Trainable params: 10 (40.00 B)

 Non-trainable params: 0 (0.00 B)

注意，Input 对象不显示在 model.layers 中，因为它不是一个层

model.layers

[<Dense name=dense_10, built=True>]

像这样预先指定输入形状构建的模型总是拥有权重（即使在看到任何数据之前），并且总是具有定义的输出形状。

一般来说，如果已知 Sequential 模型的输入形状，建议始终提前指定它。

一个常见的调试工作流程：`add()` + `summary()`

在构建新的 Sequential 架构时，通过 add() 逐步堆叠层并经常打印模型摘要会非常有用。例如，这可以让你监控由 Conv2D 和 MaxPooling2D 层组成的堆栈如何对图像特征图进行下采样

model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3)))  # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

# Can you guess what the current output shape is at this point? Probably not.
# Let's just print it:
model.summary()

# The answer was: (40, 40, 32), so we can keep downsampling...

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

# And now?
model.summary()

# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))

Model: "sequential_5"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 123, 123, 32)      │      2,432 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_1 (Conv2D)               │ (None, 121, 121, 32)      │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 40, 40, 32)        │          0 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 11,680 (45.62 KB)

 Trainable params: 11,680 (45.62 KB)

 Non-trainable params: 0 (0.00 B)

Model: "sequential_5"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 123, 123, 32)      │      2,432 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_1 (Conv2D)               │ (None, 121, 121, 32)      │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 40, 40, 32)        │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_2 (Conv2D)               │ (None, 38, 38, 32)        │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_3 (Conv2D)               │ (None, 36, 36, 32)        │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 12, 12, 32)        │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_4 (Conv2D)               │ (None, 10, 10, 32)        │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_5 (Conv2D)               │ (None, 8, 8, 32)          │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d_2 (MaxPooling2D)  │ (None, 4, 4, 32)          │          0 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 48,672 (190.12 KB)

 Trainable params: 48,672 (190.12 KB)

 Non-trainable params: 0 (0.00 B)

非常实用，对吧？

拥有模型后该做什么

一旦你的模型架构准备就绪，你会想要

训练你的模型，评估它，并运行推理。请参阅我们的使用内置循环进行训练和评估指南
将你的模型保存到磁盘并恢复它。请参阅我们的序列化与保存指南。

使用 Sequential 模型进行特征提取

一旦 Sequential 模型被构建，它就像一个函数式 API 模型。这意味着每一层都有一个 input 和 output 属性。这些属性可用于执行巧妙的操作，例如快速创建一个提取 Sequential 模型中所有中间层输出的模型

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = ops.ones((1, 250, 250, 3))
features = feature_extractor(x)

这里有一个类似的例子，它只从一个层提取特征

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = ops.ones((1, 250, 250, 3))
features = feature_extractor(x)

使用 Sequential 模型进行迁移学习

迁移学习包括冻结模型中的底部层，并仅训练顶部层。如果你对此不熟悉，请务必阅读我们的迁移学习指南。

这里有两种常见的涉及 Sequential 模型的迁移学习蓝图。

首先，假设你有一个 Sequential 模型，并且你想冻结除最后一层之外的所有层。在这种情况下，你可以简单地迭代 model.layers 并对每个层设置 layer.trainable = False，除了最后一层。像这样

model = keras.Sequential([
    keras.Input(shape=(784)),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])

# Presumably you would want to first load pre-trained weights.
model.load_weights(...)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
  layer.trainable = False

# Recompile and train (this will only update the weights of the last layer).
model.compile(...)
model.fit(...)

另一种常见的蓝图是使用 Sequential 模型来堆叠一个预训练模型和一些新初始化的分类层。像这样

# Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

# Freeze the base model
base_model.trainable = False

# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])

# Compile & train
model.compile(...)
model.fit(...)