This project implements a Lunar Lander game using Python and Pygame. It features manual keyboard controls and an optional autoplay mode driven by a neural network trained with a genetic algorithm. The simulation incorporates basic physics, including gravity and thrust mechanics, providing a challenging landing experience. The neural network interacts with a C++ backend via pybind11 for performance.
The project is structured into several Python modules handling configuration, game logic, visuals, and neural network training, along with a C++ interface file.
The script `main.py` serves as the entry point for the application. It handles command-line arguments to select the run mode (`play`, `nn_train`, `nn_play`), initializes Pygame, and sets up the display, optional sound, game logic, and visual components.
The core `game_loop` function manages the main game cycle: it handles events (keyboard input, quit), determines the action (from the player or from the NN), updates the game state via `GameLogic`, and renders the scene using `draw_game`.
    # Simplified game loop structure from main.py
    def game_loop(mode: str):
        # ... Initialization (Pygame, screen, clock, font, sounds, logic, visuals) ...
        # ... NN Initialization if mode == 'nn_play' ...
        running = True
        game_over = False
        has_started = False  # Track if player/NN initiated movement
        while running:
            clock.tick(cfg.fps)
            action = 0  # Default: Noop
            # --- Event Handling ---
            for event in pygame.event.get():
                # ... Handle QUIT, Keydown (Q for quit) ...
                if mode == 'play' and event.type == pygame.KEYDOWN:
                    if event.key in [pygame.K_UP, pygame.K_LEFT, pygame.K_RIGHT]:
                        has_started = True  # Start game on first player input
            # --- Action Determination ---
            if not game_over:
                if mode == 'play':
                    keys = pygame.key.get_pressed()
                    if keys[pygame.K_UP]:
                        action = 1
                    elif keys[pygame.K_LEFT]:
                        action = 2
                    elif keys[pygame.K_RIGHT]:
                        action = 3
                elif mode == 'nn_play':
                    current_state = logic.get_state()
                    nn_action = NN.get_action(current_state)
                    # ... Logic to handle starting the game automatically ...
                    if has_started:
                        action = nn_action  # Use NN action if game started
            # --- Game Logic Update ---
            if has_started and not game_over:
                state, done = logic.update(action)  # Update physics and check state
                if done:
                    game_over = True
                    # ... Print landing/crash message ...
            # --- Rendering ---
            draw_game(screen, logic, visuals, sounds, font)
            # --- Save Frame if enabled ---
            # ... pygame.image.save(...) ...
            # --- Game Over Handling ---
            if game_over:
                pygame.time.delay(1500)
                running = False
        pygame.quit()

    # ... main function with argparse ...
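The elided `main` function selects the run mode from the command line. A minimal sketch of how that could look with `argparse` follows; the `--mode` flag name, its default, and the `parse_args` helper are assumptions made for illustration and may differ from the actual script.

```python
import argparse

# The three run modes named in the article.
VALID_MODES = ("play", "nn_train", "nn_play")


def parse_args(argv=None):
    """Parse the run mode; the --mode flag name is hypothetical."""
    parser = argparse.ArgumentParser(description="Lunar Lander")
    parser.add_argument(
        "--mode",
        choices=VALID_MODES,
        default="play",
        help="play: keyboard control, nn_train: run the GA, nn_play: autoplay",
    )
    return parser.parse_args(argv)
```

With this sketch, `parse_args(["--mode", "nn_play"]).mode` yields the string that would be passed on to `game_loop`, and `argparse` rejects any value outside the three known modes.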
The `draw_game` function takes the current game state and renders the terrain, landing pads, lander, thrust flames, and HUD elements (fuel, position, velocity).
    # Snippet from draw_game in main.py
    def draw_game(screen: pygame.Surface, logic: GameLogic, visuals: LanderVisuals,
                  sounds: SimpleNamespace, font: pygame.font.Font):
        render_info = logic.get_render_info()
        # ... Extract x, y, fuel, vx, vy, last_action ...
        screen.fill(c.k)  # Clear screen
        # Draw terrain & pads
        # ... pygame.draw.rect(...) ...
        # Draw Lander
        scaled_images = visuals.get_scaled_images()
        screen.blit(scaled_images["lander"], (lander_x, lander_y))
        # Draw Flames based on last action and fuel
        # ... screen.blit(scaled_images["vflames"], ...) ...
        # ... screen.blit(scaled_images["rflames"], ...) ...
        # ... screen.blit(scaled_images["lflames"], ...) ...
        # Play sound if thrusting
        # ... sounds.engine_s.play() ...
        # Draw HUD
        # ... font.render(...) ...
        # ... screen.blit(...) ...
        pygame.display.flip()
All configuration is centralized in a single module, `mod_config.py`. It defines:
- `lander_cfg`: Lander properties (dimensions, flame offsets, fuel, image name).
- `cfg`: General settings (window size, FPS, sound enabled, image saving).
- `game_cfg`: Game physics and layout (initial state, pad positions/sizes, landing constraints, max steps). Includes logic for randomizing pad positions (`reset_pad_positions`, `_generate_random_pad_positions`).
- `planet_cfg`: Environmental physics (gravity, friction).
- `nn_config`: Neural network and genetic algorithm parameters (name, layers, population, GA settings, saving options, training parameters).
    # Example: Game Configuration from mod_config.py
    game_cfg = SimpleNamespace(
        random_position=True,          # set randomly the pad
        x0=np.array([52.0, cfg.height - lander_cfg.height - 52]),  # initial position
        v0=np.array([0.0, 0.0]),       # initial velocity
        a0=np.array([0.0, 0.0]),       # initial acceleration
        spad_x1=50,                    # takeoff pad left boundary
        spad_width=80,                 # takeoff pad width
        lpad_x1=cfg.width - 400,       # landing pad left boundary
        lpad_width=200,                # landing pad width
        pad_y1=cfg.height - 50,        # landing/takeoff pad top boundary
        pad_height=10,                 # landing/takeoff pad height
        terrain_y=50,                  # terrain height
        max_vx=0.5,                    # max horizontal speed for landing
        max_vy=2.0,                    # max vertical speed for landing
        max_steps=1000,                # max steps per episode
        current_seed=None              # Seed for pad positions
    )

    # Example: NN Configuration from mod_config.py
    nn_config = SimpleNamespace(
        name="lunar_lander",           # nn name
        hlayers=[8, 64, 16],           # hidden layer structure (Input: 6, Output: 4)
        seed=5247,                     # seed
        top_individuals=10,            # GA: top individuals selected
        population_size=300,           # GA: population size
        mixed_population=True,         # GA: use mixed population
        elitism=True,                  # GA: keep the best individual
        activation_id=1,               # NN: TANH activation
        save_nn=True,                  # save NN state
        overwrite=False,               # save overwriting previous generation
        save_path_nn="./data/",        # save path
        save_interval=25,              # save every n generations
        epochs=1000,                   # GA: number of training generations
        nb_batches=100,                # Reset pads every N generations if random
        fit_min=-2000,                 # Fitness threshold to detect stagnation
        fit_streak=5,                  # Generations below threshold to trigger pad reset
        verbose=False
    )
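The configuration module is said to randomize pad positions, but the article does not show how. One way this could work is sketched below; the function name, the `margin` parameter, the 1000-pixel window width used in the usage note, and the choice to keep the takeoff pad in the left half and the landing pad in the right half are all assumptions, not the project's actual logic.

```python
import numpy as np


def generate_random_pad_positions(width, spad_width, lpad_width,
                                  seed=None, margin=10):
    """Return (spad_x1, lpad_x1), the left edges of the two pads.

    Hypothetical placement rule: the takeoff pad stays in the left half of
    the window and the landing pad in the right half, so they never overlap.
    """
    rng = np.random.default_rng(seed)  # seeded for reproducible layouts
    half = width // 2
    # rng.integers uses an exclusive upper bound, hence the +1.
    spad_x1 = int(rng.integers(margin, half - spad_width - margin + 1))
    lpad_x1 = int(rng.integers(half + margin, width - lpad_width - margin + 1))
    return spad_x1, lpad_x1
```

Calling `generate_random_pad_positions(1000, 80, 200, seed=5247)` twice returns the same pair, which is the property a `current_seed` field would need so that every member of a generation is evaluated on the same layout.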
The `GameLogic` class encapsulates the state and physics of the lander.
- `reset`: Sets initial position, velocity, fuel, and landing status based on `game_cfg`. Recalculates the landing pad center.
- `_apply_action`: Modifies velocity and fuel based on the chosen action (0: Noop, 1: Up, 2: Left, 3: Right).
- `_update_physics`: Updates velocity based on gravity and friction, then updates position. Prevents the lander from going below ground level.
- `_check_landing_crash`: Determines if the lander has touched the ground and checks whether it is within the landing pad boundaries and below the maximum velocity thresholds (`max_vx`, `max_vy`). Sets the `landed`, `crashed`, and `landed_successfully` flags.
- `update`: The main step function called by the game loop. It applies the action, updates physics, checks for landing/crash, increments the time step, and returns the new state and a "done" flag.
- `is_done`: Returns `True` if the episode ended (landed or crashed).
- `get_state`: Computes and returns the current state vector normalized for the neural network (velocities, distance to pad, fuel, vertical acceleration).
- `get_render_info`: Returns a dictionary with the information needed to draw the game state.
    # Snippet from GameLogic.update in mod_game_logic.py
    def update(self, action: int):
        if self.is_done():
            return self.get_state(), True  # Return current state if already done
        self._apply_action(action)
        self._update_physics()
        self._check_landing_crash()
        self.time_step += 1
        done = self.is_done()
        state = self.get_state()  # Get state for NN
        return state, done
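The `_apply_action` and `_update_physics` steps called from `update` are described but not shown. The sketch below illustrates one plausible version; every constant, the `LanderState` container, and the convention that action 2 pushes the lander to the left are assumptions. Screen coordinates are used, so `y` grows downward and gravity increases `vy`.

```python
from dataclasses import dataclass


@dataclass
class LanderState:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    fuel: float = 100.0


# Hypothetical constants; the real values live in the project's config.
GRAVITY = 0.05
FRICTION = 0.01
THRUST_UP = 0.15
THRUST_SIDE = 0.05
FUEL_COST = 1.0
GROUND_Y = 550.0

# action id -> (delta vx, delta vy); 0 (Noop) is intentionally absent.
THRUST = {1: (0.0, -THRUST_UP), 2: (-THRUST_SIDE, 0.0), 3: (THRUST_SIDE, 0.0)}


def apply_action(s: LanderState, action: int) -> None:
    """Change velocity and burn fuel; do nothing on Noop or an empty tank."""
    if action not in THRUST or s.fuel <= 0:
        return
    dvx, dvy = THRUST[action]
    s.vx += dvx
    s.vy += dvy
    s.fuel = max(0.0, s.fuel - FUEL_COST)


def update_physics(s: LanderState) -> None:
    """Apply gravity and friction, integrate position, clamp at the ground."""
    s.vy += GRAVITY
    s.vx *= 1.0 - FRICTION
    s.x += s.vx
    s.y += s.vy
    if s.y > GROUND_Y:
        s.y = GROUND_Y  # velocity is kept so a landing check can inspect it
```

The velocity is deliberately left untouched when the position is clamped, since a subsequent landing check needs the impact speed.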
    # Snippet from GameLogic.get_state in mod_game_logic.py
    def get_state(self) -> np.ndarray:
        dist_target_x = self.x - self.landing_pad_center_x
        dist_target_y = self.y - self.landing_pad_y
        state = np.array([
            self.vx / gcfg.max_vx,                         # Normalized Vx
            self.vy / gcfg.max_vy,                         # Normalized Vy
            dist_target_x / cfg.width,                     # Normalized distance X to pad center
            dist_target_y / cfg.height,                    # Normalized distance Y to pad top
            self.fuel / lcfg.max_fuel,                     # Normalized Fuel
            self.ay / pcfg.g if pcfg.g != 0 else self.ay   # Normalized vertical acceleration
        ], dtype=float)
        # Clip normalized velocities and distances to avoid extremes
        state[0] = np.clip(state[0], -2, 2)
        state[1] = np.clip(state[1], -2, 2)
        state[2] = np.clip(state[2], -2, 2)  # Clip distance X
        state[3] = np.clip(state[3], -2, 2)  # Clip distance Y
        return state
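The landing test performed by `_check_landing_crash` can be illustrated in isolation. This is a sketch under stated assumptions: the function signature, the `LandingStatus` tuple, and the rule that the whole lander must sit inside the pad are hypothetical; only the `max_vx` and `max_vy` defaults come from the configuration shown earlier.

```python
from typing import NamedTuple


class LandingStatus(NamedTuple):
    landed: bool                # touched the pad/ground level
    crashed: bool               # touched down off the pad or too fast
    landed_successfully: bool   # touched down on the pad within speed limits


def check_landing(x, bottom_y, vx, vy, lander_width,
                  pad_x1, pad_width, pad_y1,
                  max_vx=0.5, max_vy=2.0) -> LandingStatus:
    """Classify the lander; y grows downward, as in screen coordinates."""
    if bottom_y < pad_y1:
        return LandingStatus(False, False, False)  # still airborne
    on_pad = pad_x1 <= x and x + lander_width <= pad_x1 + pad_width
    slow_enough = abs(vx) <= max_vx and abs(vy) <= max_vy
    ok = on_pad and slow_enough
    return LandingStatus(landed=True, crashed=not ok, landed_successfully=ok)
```

Under these assumptions, a touchdown on the pad at `vy = 3.0` counts as a crash because it exceeds `max_vy = 2.0`, while the same touchdown at `vy = 1.0` is a successful landing.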
The `LanderVisuals` class is responsible for loading and providing the lander and flame images.
- `__init__`: Loads the lander image (`lander_1.png` or `lander_2.png`, based on `lander_cfg`) and the flame images (`vertical_flames.png`, `horizontal_flames_l.png`, `horizontal_flames_r.png`) using Pygame. It also pre-scales these images to the dimensions defined in `lander_cfg`.
- `get_scaled_images`: Returns a dictionary containing the pre-scaled Pygame surfaces for easy access during rendering.
    # Structure of LanderVisuals in mod_lander.py
    class LanderVisuals:
        def __init__(self):
            # Store dimensions from lcfg
            self.width = lcfg.width
            # ... other dimensions ...
            # Load assets using pygame.image.load
            try:
                self.image = pygame.image.load('assets/png/' + lcfg.img + '.png').convert_alpha()
                self.vflames = pygame.image.load('assets/png/vertical_flames.png').convert_alpha()
                # ... load lflames, rflames ...
            except pygame.error as e:
                print(f"Error loading lander assets: {e}")
            # Pre-scale images using pygame.transform.scale
            self.lander_img_scaled = pygame.transform.scale(self.image, (self.width, self.height))
            # ... scale vflames, lflames, rflames ...

        def get_scaled_images(self):
            return {
                "lander": self.lander_img_scaled,
                "vflames": self.vflames_img_scaled,
                "lflames": self.lflames_img_scaled,
                "rflames": self.rflames_img_scaled
            }
The module `mod_nn_train.py` defines the `NeuralNetwork` class, which acts as a Python wrapper around the C++ neural network/GA library (`cpp_nn_py`).
- `init`: Creates a new C++ `ANN_MLP_GA_double` object with the network structure (`_nnsize`, derived from `nn_config`), population size, and other GA parameters. Creates the initial population.
- `load`: Loads a previously saved network state (`.hd5` file) using the C++ object's `Deserialize` method. Retrieves the generation count from the loaded network.
- `save`: Saves the current network state to an `.hd5` file using `Serialize`. Creates a `_last.hd5` copy for easy access to the latest model.
- `_calculate_terminal_penalty`: Computes the fitness penalty/reward at the end of an episode based on success/failure, fuel remaining, final velocity, and steps taken. Lower scores are better.
- `train`: Runs the genetic algorithm training loop. For each generation:
  - Optionally resets pad positions based on `nn_config.nb_batches`.
  - Iterates through each member of the population:
    - Runs a full game simulation using `GameLogic`.
    - Gets actions from the current member's NN using `feedforward`.
    - Calculates step penalties (e.g., distance from pad).
    - Calculates the terminal penalty using `_calculate_terminal_penalty`.
    - Stores the final fitness score (step penalties + terminal penalty).
  - Sorts members by fitness (lower is better).
  - Updates the C++ GA object's weights and biases using `UpdateWeightsAndBiases`.
  - Creates the next generation's population using `CreatePopulation`.
  - Updates and prints generation statistics.
  - Checks for fitness stagnation and potentially resets pads.
  - Saves the network periodically.
- `get_action`: Takes the current game state, feeds it to the C++ network (`feedforward` for member 0, assumed to be the best), and returns the action corresponding to the highest output neuron.
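The `get_action` step described in the last item reduces to a forward pass followed by an argmax. The sketch below assumes the same `feedforward(inputs, outputs, member_id, single_return)` calling convention that the training code uses; `DummyNet` is a hypothetical stand-in so the example runs without the compiled `cpp_nn_py` module, and it does not reproduce the real network.

```python
import numpy as np


class DummyNet:
    """Stand-in for the C++ network object (NOT the real cpp_nn_py API)."""

    def __init__(self, n_inputs=6, n_outputs=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_outputs, n_inputs))

    def feedforward(self, inputs, outputs, member_id, single_return):
        # Writes the result into the caller's buffer, as the binding does.
        outputs[:] = np.tanh(self.w @ inputs)


def get_action(net, state, n_outputs=4) -> int:
    """Return the index of the strongest output neuron for population member 0."""
    inputs = np.asarray(state, dtype=np.float64)
    outputs = np.zeros(n_outputs, dtype=np.float64)
    net.feedforward(inputs, outputs, 0, False)
    return int(np.argmax(outputs))
```

The returned integer maps directly onto the action ids used by the game logic (0: Noop, 1: Up, 2: Left, 3: Right).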
    # Snippet from NeuralNetwork.train in mod_nn_train.py
    def train(self):
        # ... Initialization checks ...
        print(f"Starting training for {cfg.epochs} generations...")
        # ... Print config details ...
        population_size = self._net.GetPopSize()
        for gen in range(self._nGen, self._nGen + cfg.epochs):
            # ... Optional pad reset logic ...
            fitness_scores = np.zeros(population_size, dtype=np.float64)
            all_steps = []
            print(f"\n--- Generation {gen + 1} ---")
            for member_id in range(population_size):
                game_sim = GameLogic(no_print=True)  # Simulate game for this member
                state = game_sim.get_state()
                done = False
                accumulated_step_penalty = 0.0
                steps = 0
                while not done and steps < game_cfg.max_steps:
                    inputs = np.array(state, dtype=np.float64)
                    outputs = np.zeros(self._nnsize[-1], dtype=np.float64)
                    # Get action from this member's NN
                    self._net.feedforward(inputs, outputs, member_id, False)
                    action = np.argmax(outputs)
                    next_state, done = game_sim.update(action)  # Update game
                    # Calculate step penalties (e.g., distance)
                    dist_x = abs(game_sim.x - game_sim.landing_pad_center_x)
                    dist_y = abs(game_sim.y - game_sim.landing_pad_y)
                    accumulated_step_penalty += (dist_x + dist_y) * 0.001
                    state = next_state
                    steps += 1
                    if done:
                        break
                # Calculate terminal penalty
                terminal_penalty = self._calculate_terminal_penalty(game_sim, steps)
                fitness_scores[member_id] = accumulated_step_penalty + terminal_penalty
                all_steps.append(steps)
            # Update GA based on fitness
            sorted_indices = np.argsort(fitness_scores)  # Lower is better
            self._net.UpdateWeightsAndBiases(sorted_indices)
            self._net.CreatePopulation(cfg.elitism)
            self._net.UpdateEpochs(1)
            self._nGen = self._net.GetEpochs()
            # Print stats, check stagnation, save periodically
            # ...
        print("\nTraining finished.")
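The body of `_calculate_terminal_penalty` is not shown. The sketch below illustrates the general shape such a function could take, given that lower scores are better; all weights, the argument list, and the timeout branch are hypothetical and not the project's values.

```python
import numpy as np


def terminal_penalty(landed_successfully, crashed, fuel_frac, vx, vy, steps):
    """End-of-episode score (lower is better); all weights are hypothetical."""
    if landed_successfully:
        # Reward success, with a bonus for leftover fuel and a small time cost.
        return -1000.0 - 100.0 * fuel_frac + 0.1 * steps
    if crashed:
        # Harder impacts are penalized more.
        return 500.0 + 50.0 * (abs(vx) + abs(vy))
    return 1000.0  # episode hit the step limit without touching down


# Three example members: a crash, a clean landing, and a timeout.
scores = np.array([
    terminal_penalty(False, True, 0.2, 1.0, 3.0, 200),
    terminal_penalty(True, False, 0.5, 0.1, 1.0, 300),
    terminal_penalty(False, False, 0.0, 0.0, 0.0, 1000),
])
ranking = np.argsort(scores)  # best member first, as in the training loop
```

Sorting with `np.argsort` puts the successful landing first, which is the ordering the training loop hands to `UpdateWeightsAndBiases`.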
The interface file `ann_mlp_ga_py_interface.h` uses pybind11 to create Python bindings for the C++ `ANN_MLP_GA` template class (specialized for `float` and `double`). It exposes the class constructor, methods for configuration (`SetName`, `SetEpochs`), state management (`Serialize`, `Deserialize`, `GetEpochs`), GA operations (`UpdateWeightsAndBiases`, `CreatePopulation`), and the core `feedforward` function to Python. This allows the Python code (`mod_nn_train.py`) to instantiate and interact with the C++ neural network objects.
    // Snippet from ann_mlp_ga_py_interface.h
    PYBIND11_MODULE(cpp_nn_py, m)
    {
        // expose ANN_MLP_GA<double> to Python
        py::class_<nn::ANN_MLP_GA<double>>(m, "ANN_MLP_GA_double")
            .def(py::init<>())
            // Constructor with network size, seed, pop size, top performers, activation, elitism
            .def(py::init<std::vector<size_t>, int, size_t, size_t, size_t, bool>())
            // ... other methods like PrintNetworkInfo, SetName, SetEpochs ...
            .def("Serialize", &nn::ANN_MLP_GA<double>::Serialize)
            .def("Deserialize", &nn::ANN_MLP_GA<double>::Deserialize)
            .def("GetPopSize", &nn::ANN_MLP_GA<double>::GetPopSize)
            .def("GetEpochs", &nn::ANN_MLP_GA<double>::GetEpochs)
            .def("UpdateWeightsAndBiases", &nn::ANN_MLP_GA<double>::UpdateWeightsAndBiases)
            // Expose feedforward, handling numpy arrays
            .def("feedforward",
                 [](nn::ANN_MLP_GA<double>& self, py::array_t<double> inputs,
                    py::array_t<double> outputs, size_t memberid, bool singleReturn) {
                     // ... get raw pointers from numpy arrays ...
                     const double* pInputs = inputs.data();
                     double* pOutputs = outputs.mutable_data();
                     size_t inputsSize = static_cast<size_t>(inputs.size());
                     size_t outputsSize = static_cast<size_t>(outputs.size());
                     // Call C++ method
                     self.feedforward(pInputs, inputsSize, pOutputs, outputsSize, memberid, singleReturn);
                 },
                 py::arg("inputs"), py::arg("outputs"), py::arg("memberid"), py::arg("singleReturn"))
            // ... TrainGA, TestGA (not used in Python?), SetMixed, CreatePopulation ...
            ;
    }
The complete code is available on GitHub at the following link. The repository contains detailed instructions on how to set up the environment, compile the code, and run the training and the tests.