# Linear Layer
The Linear layer (also known as a fully connected or dense layer) performs a linear transformation on the input data. It's one of the most fundamental building blocks in neural networks.
## Definition

```rust
pub struct Linear {
    weight: Tensor,
    bias: Option<Tensor>,
    state: LayerState,
}
```
## Constructor

```rust
pub fn new(in_features: usize, out_features: usize, with_bias: bool) -> Result<Self>
```
Creates a new Linear layer with the specified dimensions.
**Parameters:**

- `in_features`: The size of each input sample
- `out_features`: The size of each output sample
- `with_bias`: Whether to include a bias term
**Example:**

```rust
let linear = Linear::new(784, 256, true)?;
```
For more control over the initialization, you can use the extended constructor:
```rust
pub fn new_with_spec(
    in_features: usize,
    out_features: usize,
    with_bias: bool,
    device: Device,
    dtype: DType,
) -> Result<Self>
```
**Additional Parameters:**

- `device`: The device to place the layer's parameters on (CPU, CUDA, or MPS)
- `dtype`: The data type for the layer's parameters
**Example:**

```rust
let linear = Linear::new_with_spec(784, 256, true, Device::CUDA(0), DType::F32)?;
```
## Forward Pass

```rust
pub fn forward(&self, input: &Tensor) -> Result<Tensor>
```

Applies the linear transformation `y = x @ W.T + b`, where `W.T` is the transposed weight matrix.
**Parameters:**

- `input`: The input tensor with shape `[batch_size, ..., in_features]`

**Returns:** Output tensor with shape `[batch_size, ..., out_features]`
**Example:**

```rust
let input = Tensor::new(vec![1.0, 2.0, 3.0, 4.0])?.reshape(&[2, 2])?;
let linear = Linear::new(2, 3, true)?;
let output = linear.forward(&input)?; // Shape: [2, 3]
```
## Parameter Access

```rust
pub fn weight(&self) -> &Tensor
pub fn bias(&self) -> Option<&Tensor>
```
Provides access to the layer's weight and bias parameters.
**Example:**

```rust
let linear = Linear::new(2, 3, true)?;
let weight = linear.weight();      // Shape: [3, 2]
let bias = linear.bias().unwrap(); // Shape: [3]
```
## Layer Implementation

The Linear layer implements the `Layer` trait, providing methods for parameter collection and training state management:

```rust
pub fn parameters(&mut self) -> Vec<&mut Tensor>
```
Returns all trainable parameters of the layer (weight and bias if present).
## Mathematical Operation

For an input tensor `x` of shape `[batch_size, in_features]`, the Linear layer computes:

```text
output = x @ weight.T + bias
```
Where:

- `@` represents matrix multiplication
- `weight.T` is the transpose of the weight matrix; `weight` has shape `[out_features, in_features]`, so `weight.T` has shape `[in_features, out_features]`
- `bias` is the bias vector of shape `[out_features]`
The output tensor has shape [batch_size, out_features].
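To make the arithmetic concrete, here is a minimal, library-independent sketch of the same computation using plain slices instead of the `Tensor` type. The helper name `linear_forward` is illustrative only and is not part of this API; it assumes row-major layout for both the input and the weight matrix.

```rust
// Sketch of `output = x @ weight.T + bias` over flat, row-major buffers.
// input:  [batch_size, in_features]
// weight: [out_features, in_features]  (so x @ weight.T is a row-by-row dot product)
// bias:   [out_features]
fn linear_forward(
    input: &[f32],
    weight: &[f32],
    bias: &[f32],
    batch_size: usize,
    in_features: usize,
    out_features: usize,
) -> Vec<f32> {
    let mut output = vec![0.0f32; batch_size * out_features];
    for b in 0..batch_size {
        for o in 0..out_features {
            // Dot product of input row `b` with weight row `o`, plus the bias term.
            let mut acc = bias[o];
            for i in 0..in_features {
                acc += input[b * in_features + i] * weight[o * in_features + i];
            }
            output[b * out_features + o] = acc;
        }
    }
    output
}

fn main() {
    // 2 samples with 2 input features each, mapped to 3 output features.
    let input = [1.0, 2.0, 3.0, 4.0];                 // shape [2, 2]
    let weight = [1.0, 0.0, 0.0, 1.0, 1.0, 1.0];      // shape [3, 2]
    let bias = [0.5, 0.5, 0.5];                       // shape [3]
    let output = linear_forward(&input, &weight, &bias, 2, 2, 3);
    assert_eq!(output, vec![1.5, 2.5, 3.5, 3.5, 4.5, 7.5]); // shape [2, 3]
}
```

Because `weight` is stored as `[out_features, in_features]`, each output element is simply a dot product between an input row and a weight row, which is why the transpose appears in the formula but no explicit transposition is needed in the loop.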
## Broadcasting Support
The Linear layer supports broadcasting for batched inputs. If the input tensor has additional leading dimensions, they are preserved in the output:
```rust
let input = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0])?.reshape(&[3, 2])?;
let linear = Linear::new(2, 4, true)?;
let output = linear.forward(&input)?; // Shape: [3, 4]
```
For a more complex batch structure:
```rust
// Input shape: [batch_size, sequence_length, in_features]
let input = Tensor::new(vec![/* values */])?.reshape(&[32, 10, 64])?;
let linear = Linear::new(64, 128, true)?;
let output = linear.forward(&input)?; // Shape: [32, 10, 128]
```
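A common way to implement this kind of broadcasting (a sketch of the general technique, not necessarily this library's internals; `forward_nd_shape` is a hypothetical helper) is to collapse all leading dimensions into a single batch dimension, run the 2-D transform, and then reattach the leading dimensions to the output shape:

```rust
// Illustrative shape logic for N-D inputs: every dimension except the last
// is treated as batch and preserved; the last maps in_features -> out_features.
// Returns (flattened 2-D batch size, output shape).
fn forward_nd_shape(input_shape: &[usize], out_features: usize) -> (usize, Vec<usize>) {
    // Effective 2-D batch size = product of all leading dimensions.
    let flat_batch: usize = input_shape[..input_shape.len() - 1].iter().product();
    // Output keeps the leading dimensions, with the last one replaced.
    let mut out_shape = input_shape[..input_shape.len() - 1].to_vec();
    out_shape.push(out_features);
    (flat_batch, out_shape)
}

fn main() {
    // [32, 10, 64] is multiplied as a [320, 64] matrix, then viewed as [32, 10, 128].
    assert_eq!(forward_nd_shape(&[32, 10, 64], 128), (320, vec![32, 10, 128]));
    // A plain 2-D input is the degenerate case with one leading dimension.
    assert_eq!(forward_nd_shape(&[3, 2], 4), (3, vec![3, 4]));
}
```

This works because the matrix multiplication only touches the trailing `in_features` dimension; the flatten-and-restore step is purely a view change when the tensor is contiguous.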