Feedforward Neural Networks
A feedforward network (also called a multilayer perceptron, or MLP) is the simplest neural network architecture. Data flows in one direction: from the input layer through one or more hidden layers to the output. These networks form the foundation for the deep learning architectures used in trading.
Neural Network Basics
The Neuron
A neural network is composed of layers of artificial neurons. Each neuron:
- Takes multiple inputs (x1, x2, …, xn)
- Multiplies each by a weight (w1, w2, …, wn)
- Adds a bias term (b)
- Applies an activation function (sigma)
output = sigma(w1*x1 + w2*x2 + ... + wn*xn + b)
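In code, a single neuron is just a dot product plus a bias, passed through a nonlinearity. A minimal NumPy sketch (the names here are illustrative, not part of puffin):

```python
import numpy as np

def neuron(x, w, b, sigma=np.tanh):
    """Single artificial neuron: sigma(w . x + b)."""
    return sigma(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 3.0])   # inputs x1..x3
w = np.array([0.1, 0.4, -0.2])   # weights w1..w3
b = 0.05                         # bias term
print(neuron(x, w, b))           # scalar activation
```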
The universal approximation theorem states that a feedforward network with a single hidden layer containing finitely many neurons can approximate any continuous function on a compact domain to arbitrary accuracy. In practice, however, deeper networks with fewer neurons per layer tend to train more easily and generalize better.
Activation Functions
Common activation functions used in trading networks:
| Function | Formula | Use Case |
|---|---|---|
| ReLU | max(0, x) | Hidden layers (most common) |
| Tanh | (e^x - e^-x) / (e^x + e^-x) | Hidden layers (centered output) |
| Leaky ReLU | max(0.01x, x) | Prevents dying neurons |
| ELU | x if x > 0, else alpha * (e^x - 1) | Smooth alternative to ReLU |
| Sigmoid | 1 / (1 + e^-x) | Binary classification output |
| Softmax | e^(x_i) / Σ_j e^(x_j) | Multi-class classification |
For regression tasks (predicting returns), we typically use no activation on the output layer.
ReLU is the default choice for hidden layers; switch to Leaky ReLU or ELU only if you observe “dying neurons”, i.e. many units stuck outputting zero. For output layers, use no activation for regression, sigmoid for binary classification, and softmax for multi-class.
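One practical way to check for dying neurons is to measure what fraction of a layer's post-ReLU activations are exactly zero on a batch of data. A minimal PyTorch sketch (the layer and batch here are illustrative):

```python
import torch
import torch.nn as nn

layer = nn.Sequential(nn.Linear(50, 64), nn.ReLU())
x = torch.randn(256, 50)                        # a batch of inputs

with torch.no_grad():
    acts = layer(x)                             # post-ReLU activations
dead_frac = (acts == 0).float().mean().item()   # fraction of zeroed units
print(f"Fraction of zero activations: {dead_frac:.2%}")
# At random initialization roughly half the units are zero on any
# given batch; a fraction that stays far higher than this during
# training suggests dying units and a switch to Leaky ReLU or ELU.
```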
PyTorch vs TensorFlow
Both frameworks are widely used in finance. The puffin.deep module provides implementations in both.
PyTorch
Pros:
- More Pythonic and intuitive
- Dynamic computational graphs
- Easier debugging
- Popular in research
Cons:
- Smaller production ecosystem
- Less enterprise tooling
TensorFlow/Keras
Pros:
- Larger production ecosystem
- TensorFlow Serving for deployment
- Better mobile/edge device support
- High-level Keras API
Cons:
- Steeper learning curve
- More verbose for research
Architecture for Return Prediction
The TradingFFN class provides a high-level interface for building and training feedforward networks:
```python
from puffin.deep import TradingFFN
import numpy as np

# Generate features
X = np.random.randn(1000, 50)  # 1000 samples, 50 features
y = np.random.randn(1000)      # Target returns

# Create model with 2 hidden layers
model = TradingFFN(
    input_dim=50,
    hidden_dims=[64, 32],  # 2 hidden layers
    output_dim=1,          # Single output (return prediction)
    dropout=0.3,           # 30% dropout
    activation='relu'      # ReLU activation
)

# Train model
history = model.fit(
    X, y,
    epochs=100,
    lr=0.001,  # Learning rate
    batch_size=64,
    validation_split=0.2,
    verbose=True
)

# Make predictions
predictions = model.predict(X)
```
Always use time-series aware train/test splits when working with financial data. Random splits introduce look-ahead bias, where future information leaks into the training set. Use chronological splits or walk-forward validation instead.
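For walk-forward validation, scikit-learn's TimeSeriesSplit produces expanding chronological folds in which the training indices always precede the test indices. A minimal sketch with synthetic data:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.random.randn(1000, 50)
y = np.random.randn(1000)

tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    # train_idx always ends before test_idx begins: no look-ahead
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    print(f"Fold {fold}: train rows 0-{train_idx[-1]}, "
          f"test rows {test_idx[0]}-{test_idx[-1]}")
```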
PyTorch Implementation Details
The FeedforwardNet class provides full PyTorch flexibility for custom workflows:
```python
from puffin.deep import FeedforwardNet
import torch

# Create raw PyTorch model
model = FeedforwardNet(
    input_dim=50,
    hidden_dims=[64, 32, 16],
    output_dim=1,
    dropout=0.3,
    activation='relu'
)

# Manual forward pass
x = torch.randn(32, 50)  # Batch of 32
output = model(x)        # Returns (32, 1)

# Access network layers
print(model.network)
```
The FeedforwardNet internally constructs a sequential stack of layers: Linear -> BatchNorm1d -> Activation -> Dropout for each hidden layer, followed by a final Linear output layer.
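Conceptually, each hidden block is equivalent to the following hand-built nn.Sequential. This is a sketch of the structure described above, not the actual puffin source:

```python
import torch.nn as nn

def hidden_block(in_dim, out_dim, dropout=0.3):
    # Linear -> BatchNorm1d -> Activation -> Dropout, as described above
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.BatchNorm1d(out_dim),
        nn.ReLU(),
        nn.Dropout(dropout),
    )

network = nn.Sequential(
    hidden_block(50, 64),
    hidden_block(64, 32),
    nn.Linear(32, 1),  # final Linear output layer, no activation (regression)
)
```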
TensorFlow Implementation
For teams preferring TensorFlow/Keras, the puffin.deep.feedforward_tf module provides an equivalent API:
```python
from puffin.deep.feedforward_tf import TradingFFN_TF, check_tensorflow_available

# Check if TensorFlow is installed
if check_tensorflow_available():
    model = TradingFFN_TF(
        input_dim=50,
        hidden_dims=[64, 32],
        output_dim=1,
        dropout=0.3,
        activation='relu'
    )
    history = model.fit(
        X, y,
        epochs=100,
        lr=0.001,
        batch_size=64,
        validation_split=0.2
    )
    predictions = model.predict(X)
else:
    print("TensorFlow not installed")
```
The TensorFlow implementation degrades gracefully when TensorFlow is not installed, making the codebase more portable. All puffin.deep classes default to PyTorch, with TensorFlow offered as an optional alternative.
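Such an availability check is typically written as a plain import guard. A sketch of the pattern (the actual implementation in puffin/deep/feedforward_tf.py may differ):

```python
def tensorflow_available() -> bool:
    """Return True if TensorFlow can be imported (sketch of a typical guard)."""
    try:
        import tensorflow  # noqa: F401
        return True
    except ImportError:
        return False
```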
Complete Example: Predicting Stock Returns
Here is a full pipeline for training a neural network to predict next-day returns:
```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from puffin.data import YFinanceProvider
from puffin.ml import compute_features
from puffin.deep import TradingFFN
from puffin.deep.training import set_seed

# Set random seed for reproducibility
set_seed(42)

# Load data
data = YFinanceProvider().fetch('AAPL', start='2020-01-01', end='2023-12-31')

# Compute features
features = compute_features(data)

# Create target: next-day return
target = data['Close'].pct_change().shift(-1)

# Remove NaNs
mask = features.notna().all(axis=1) & target.notna()
X = features[mask].values
y = target[mask].values

# Train/test split (time-series aware)
split_idx = int(len(X) * 0.8)
X_train, X_test = X[:split_idx], X[split_idx:]
y_train, y_test = y[:split_idx], y[split_idx:]

# Standardize features: fit the scaler on the training set only,
# so test-period statistics never leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create and train model
model = TradingFFN(
    input_dim=X_train.shape[1],
    hidden_dims=[128, 64, 32],
    output_dim=1,
    dropout=0.4,
    activation='relu'
)
history = model.fit(
    X_train, y_train,
    epochs=100,
    lr=0.001,
    batch_size=64,
    validation_split=0.2,
    verbose=True
)

# Evaluate on test set
test_predictions = model.predict(X_test)

# Compute test metrics
test_mse = mean_squared_error(y_test, test_predictions)
test_mae = mean_absolute_error(y_test, test_predictions)
test_r2 = r2_score(y_test, test_predictions)
print(f"Test MSE: {test_mse:.6f}")
print(f"Test MAE: {test_mae:.6f}")
print(f"Test R2: {test_r2:.4f}")

# Save model
model.save('models/aapl_return_predictor')

# Load later
loaded_model = TradingFFN.load('models/aapl_return_predictor')
```
When evaluating return prediction models, R-squared values are typically very low (often below 0.05) for daily returns. This is normal for financial data. Even a small positive R-squared can be economically significant when combined with proper position sizing and risk management.
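To see why a tiny R-squared can still matter, look at directional accuracy and the return of a naive sign-following strategy. A sketch using the test arrays from the pipeline above (it ignores transaction costs and is illustrative only):

```python
import numpy as np

preds = np.asarray(test_predictions).ravel()
actual = np.asarray(y_test).ravel()

# Hit rate: how often the predicted sign matches the realized sign
hit_rate = np.mean(np.sign(preds) == np.sign(actual))

# Naive strategy: long when the model predicts up, short when down
strategy_returns = np.sign(preds) * actual
print(f"Hit rate: {hit_rate:.2%}")
print(f"Mean daily strategy return: {strategy_returns.mean():.6f}")
```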
Architecture Design Guidelines
When designing feedforward networks for trading:
| Guideline | Recommendation |
|---|---|
| Depth | Start with 2-3 hidden layers |
| Width | First hidden layer 1-2x the input dimension |
| Narrowing | Each successive layer should be narrower |
| Dropout | 0.3-0.5 for financial data |
| Activation | ReLU for hidden layers, none for regression output |
| Output | Single neuron for return prediction, 2-3 for classification |
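Applied to a 50-feature dataset, these guidelines translate directly into a constructor call (the dimensions here are illustrative):

```python
from puffin.deep import TradingFFN

input_dim = 50
model = TradingFFN(
    input_dim=input_dim,
    hidden_dims=[2 * input_dim, input_dim, input_dim // 2],  # 100 -> 50 -> 25, narrowing
    output_dim=1,        # single neuron for return prediction
    dropout=0.4,         # within the 0.3-0.5 range for financial data
    activation='relu'    # ReLU hidden layers; regression output has no activation
)
```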
Source Code
- Feedforward network (PyTorch): puffin/deep/feedforward.py
- Feedforward network (TensorFlow): puffin/deep/feedforward_tf.py
- Module init: puffin/deep/__init__.py